References: <87sgcuh8rb.fsf@ambrevar.xyz> <87y2ml429i.fsf@elephly.net> <87364tgja3.fsf@ambrevar.xyz> <87y2mlf4jw.fsf@ambrevar.xyz>
From: Ricardo Wurmus
To: Pierre Neidhardt
Subject: Re: File search progress: database review and question on triggers
In-reply-to: <87y2mlf4jw.fsf@ambrevar.xyz>
Date: Tue, 11 Aug 2020 22:08:39 +0200
Message-ID: <87pn7x3pyw.fsf@elephly.net>
List-Id: "Development of GNU Guix and the GNU System distribution."
Cc: guix-devel@gnu.org

Pierre Neidhardt writes:

> Pierre Neidhardt writes:
>
>> Ricardo Wurmus writes:
>>
>>> I’m not suggesting to use updatedb, but I think it can be instructive to
>>> look at how the file database is implemented there.  We don’t have to
>>> use SQlite if it is much slower and heavier than a custom inverted
>>> index.
>>
>> Good call, I'll benchmark against an inverted index.
>>
>> Some cost may also be induced by the Guix store queries, not sure if we
>> can optimize these.
>
> With an s-exp based file, or a trivial text-based format, the downside
> is that it needs a bit of extra work to only load select entries,
> e.g. just the entries matching a specific Guix version.
>
> Would you happen to know a serialization library that allows for loading
> only a select portion of a file?

I don’t know of any suitable file format, but a generated offset index
at the beginning of the file could be useful.  You’d read the first
expression and then seek to the specified byte offset (after the
position of the index expression) where you then read the target
expression.
This can easily be generated and it can be extended without having to
rewrite the whole file.  But perhaps that’s premature optimization.

-- 
Ricardo
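
The offset-index layout described in the message could be sketched roughly as
follows.  This is only an illustration under several assumptions that are not
in the thread: it uses Python and JSON as stand-ins for Guile and
s-expressions, the index occupies the first line of the file, and the helper
names `write_indexed` and `read_entry` as well as the sample entries are
hypothetical.

```python
import io
import json


def write_indexed(stream, entries):
    """Write `entries` (dict: key -> JSON-serializable value) to a binary stream.

    Layout: one line holding the index, then the serialized records back to
    back.  Offsets in the index are relative to the first byte after the
    index line.
    """
    index = {}
    body = bytearray()
    for key, value in entries.items():
        record = json.dumps(value).encode("utf-8")
        index[key] = [len(body), len(record)]  # [relative offset, length]
        body += record
    stream.write(json.dumps(index).encode("utf-8") + b"\n")
    stream.write(bytes(body))


def read_entry(stream, key):
    """Load only the record stored under `key`, skipping all other records."""
    stream.seek(0)
    index = json.loads(stream.readline())  # read the first expression (the index)
    base = stream.tell()                   # position right after the index
    offset, length = index[key]
    stream.seek(base + offset)             # jump straight to the target record
    return json.loads(stream.read(length))


# Usage with an in-memory file.  The keys stand in for Guix versions and the
# values are made-up sample data, not real database contents.
buf = io.BytesIO()
write_indexed(buf, {
    "1.1.0": {"bin/hello": "hello-2.10"},
    "1.2.0": {"bin/hello": "hello-2.10", "bin/sed": "sed-4.8"},
})
entry = read_entry(buf, "1.2.0")
```

Since the offsets are measured from the end of the index, the offsets of
existing records stay valid when further records are appended after them.
Updating the index itself in place is not handled by this sketch.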