Subject: [bug#39258] [PATCH 0/4] Xapian for Guix package search
References: <20200227204150.30985-1-arunisaac@systemreboot.net>
In-Reply-To: <20200227204150.30985-1-arunisaac@systemreboot.net>
From: zimoun
Date: Fri, 28 Feb 2020 13:36:06 +0100
To: Arun Isaac
Cc: Ludovic Courtès, Pierre Neidhardt, 39258@debbugs.gnu.org

Hi Arun,

Really cool! Thank you!

On Thu, 27 Feb 2020 at 21:42, Arun Isaac wrote:

> * Speed improvement
>
> Despite search-package-index in gnu/packages.scm taking only around 1.5ms, I
> see an overall speedup in `guix search` of only a factor of 2 -- from around
> 2s to around 1s. I wonder what else in `guix search` is taking up so much
> time.

Interesting... maybe a hidden 'fold-packages'? Well, I have not yet
looked into your code.

> * Currently indexing only the package descriptions
>
> In this patchset, I have only indexed the package descriptions.
> In the next version of this patchset, I will index all other terms as
> specified in %package-metrics of guix/ui.scm.

Yes, that looks to me like a detail that should be easy to fix. I
mean, it does not seem to be blocking.

> * Should I add guile-xapian as a propagated input to guix in
> gnu/packages/package-management.scm?

IMHO, yes. I mean, I guess. :-)

> * Drop regexp search support
>
> In this patchset, I have retained the older regexp search support. But, I
> think we should drop it and only have xapian search. In cases where the search
> index is not authoritative, we can build an in-memory xapian search index on
> the fly and use it to search. This will slow down the search, but will ensure
> our search results are consistent and do not depend on the authoritativeness
> of the search index.

I understand why you have turned off the regexp support. It is not
needed for a first experiment whose goal is to see whether the
addition is worthwhile. So, before investigating how some better
regexp could be used with Xapian, let's start by benchmarking Xapian
against plain 'fold-packages'.

> * Commit messages
>
> Except for patch 1, I am not sure what prefixes (build-self, gnu, etc.) to use
> in the first line of the commit message. Some advice there would be helpful.

I cannot help. )-:

All the best,
simon