From: Simon Tournier
To: Csepp
Cc: Ludovic Courtès, Andreas Enge, guix-devel@gnu.org
Subject: Faster “guix search” (was Re: How many bytes do we add (closure of guix) when adding one new package?)
In-Reply-To: <87r0qxvd9q.fsf@riseup.net>
References: <875y9jzl9m.fsf@gnu.org> <874jot19fd.fsf_-_@gnu.org> <87fs7rvv5s.fsf_-_@gnu.org> <878rddooy4.fsf@gnu.org> <87r0r4uv4x.fsf@gmail.com> <87ttvzhxm9.fsf@gnu.org> <87cz2it3yk.fsf@gmail.com> <87r0qxvd9q.fsf@riseup.net>
Date: Wed, 31 May 2023 10:05:15 +0200
Message-ID: <86y1l5oric.fsf@gmail.com>

Hi,

On Tue, 30 May 2023 at 21:10, Csepp wrote:

> It makes zero sense to load full package definitions from
> disk for most queries, such as guix search, with an SoA representation
> we could load only the fields that we care about.

That’s already the case; see
~/.config/guix/current/lib/guix/package.cache. For instance, “guix
package -A” exploits it and the performance is acceptable.

Two summers ago (wow, already!), I tried to augment it and exploit it
for “guix search”. The implementation and benchmark are in
#39258 [1]. The whole thread of #39258 seems to me worth considering
because it spots various bottlenecks specific to “guix search” and
explains why the improvement is not straightforward.

Months ago, I started writing a Guix extension using guile-xapian.
My aim is to tackle two annoyances: 1. the speed and 2. the relevance.

About the relevance (#2), the issue is that the current scoring
considers only the local information of one package without considering
the global information of all the others. See [2,3,4] for some
details. :-)

1: https://issues.guix.gnu.org/39258#119
2: https://yhetil.org/guix/CAJ3okZ3E3bhZ5pROZS68wEKdKOcZ8SpXsvdi-bnB=9Jz3mPahA@mail.gmail.com
3: https://yhetil.org/guix/CAJ3okZ3+hn0nJP98OhnZYLWJvhLGpdTUK+jB0hoM5JArQxO=zw@mail.gmail.com
4: https://yhetil.org/guix/CAJ3okZ0LaJzWDBA7bjqZew_jAmtt1rj9PJhevwrtBiA_COXENg@mail.gmail.com

> ps.: Now I'm even more glad that I'm using a file system with
> transparent compression on all my Guix systems.

Did you benchmark the performance of some Guix operations on
compressed vs. uncompressed file systems?

Cheers,
simon
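
PS: a toy illustration of the relevance point above (local vs. global
information). This is only a minimal sketch in Python; it is neither the
scoring used by “guix search” nor the guile-xapian extension. The
package names, descriptions and function names are all made up for the
example, and a plain inverse-document-frequency (IDF) weight stands in
for whatever a real index would compute.

```python
import math
from collections import Counter

# Hypothetical toy corpus: package name -> description.
# These are NOT real Guix packages or metadata.
PACKAGES = {
    "foo-editor": "text editor with plugin support",
    "bar-viewer": "image viewer with plugin support",
    "baz-tool": "command line tool for text files",
}


def tokenize(text):
    """Split on whitespace and lowercase; deliberately naive."""
    return text.lower().split()


def local_score(query, description):
    """Score using one package only: count query-term occurrences."""
    counts = Counter(tokenize(description))
    return sum(counts[term] for term in tokenize(query))


def idf_table(packages):
    """Global information: terms found in few packages get a larger weight."""
    total = len(packages)
    doc_freq = Counter()
    for description in packages.values():
        doc_freq.update(set(tokenize(description)))
    return {term: math.log(1 + total / freq) for term, freq in doc_freq.items()}


def global_score(query, description, idf):
    """Same counts as local_score, but each term is weighted by its IDF."""
    counts = Counter(tokenize(description))
    return sum(counts[term] * idf.get(term, 0.0) for term in tokenize(query))


def rank(query, packages, idf):
    """Return package names sorted by decreasing global score."""
    return sorted(
        packages,
        key=lambda name: global_score(query, packages[name], idf),
        reverse=True,
    )


if __name__ == "__main__":
    idf = idf_table(PACKAGES)
    for name, description in PACKAGES.items():
        print(name,
              local_score("text viewer", description),
              round(global_score("text viewer", description, idf), 3))
    print(rank("text viewer", PACKAGES, idf))
```

With the query “text viewer”, the local score cannot separate the three
packages (each matches exactly one term), whereas the IDF weighting puts
the only package mentioning the rarer term “viewer” first.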