From: zimoun
Subject: Re: Inverted index to accelerate guix package search
Date: Wed, 15 Jan 2020 13:00:40 +0100
References: <87a76r68u6.fsf@ambrevar.xyz> <878sm9w0l6.fsf@ambrevar.xyz>
In-Reply-To: <878sm9w0l6.fsf@ambrevar.xyz>
To: Pierre Neidhardt
Cc: Guix Devel

On Wed, 15 Jan 2020 at 10:06, Pierre Neidhardt wrote:

> We can always keep our current regexp search (which is trivial) for
> those who really want it. I believe that Xapian is much more usable
> than regexps on a daily basis.

What do you mean by trivial?
The command "guix search" supports the full Guile regexp engine, if I
remember correctly.

> > On the other hand, I can extend the inverted index implementation
> > to support regular expression searches. Personally, I don't use regular
> > expression based search queries, and don't think they are very useful
> > especially if we make use of xapian's stemming. What do people think?
>
> Agreed!
>
> I see this with my emails (Notmuch): I type whatever words I remember
> and whoever names was involved in a thread and I systematically find
> it. I've used it for months and it never missed! :)

The key point here is the scoring.
Try typing whatever words you remember into the GNU mailing list search
engine:

https://lists.gnu.org/archive/cgi-bin/namazu.cgi?query=load-path&submit=Search%21&idxname=guix-devel&max=20&result=normal&sort=score

All the best,
simon
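
To illustrate the point about scoring, here is a minimal sketch in Python. It is not Guix, Xapian or Namazu code: the package descriptions, the function names and the simple TF-IDF weighting are all assumptions chosen for illustration. It contrasts a regexp search, which only says whether an entry matches, with an inverted index that ranks the matches.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical package descriptions, made up for illustration only.
PACKAGES = {
    "guile": "scheme implementation with a module load path",
    "emacs": "extensible text editor with a lisp load path and load hooks",
    "xapian": "search engine library with ranked retrieval",
}


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Inverted index: map each term to {document name: term frequency}."""
    index = defaultdict(dict)
    for name, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][name] = count
    return index


def regexp_search(docs, pattern):
    """Boolean match: every hit is equal, so there is no relevance order."""
    rx = re.compile(pattern)
    return sorted(name for name, text in docs.items() if rx.search(text))


def scored_search(index, n_docs, query):
    """Rank documents by a simple TF-IDF sum over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        # Terms found in fewer documents weigh more.
        idf = math.log(1 + n_docs / len(postings))
        for name, tf in postings.items():
            scores[name] += tf * idf
    # Highest score first; ties broken by name for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


INDEX = build_index(PACKAGES)
```

With this toy data, `regexp_search(PACKAGES, "load")` returns the two matching names in alphabetical order, while `scored_search(INDEX, len(PACKAGES), "load path")` puts the entry that mentions the query terms more often first. A real engine such as Xapian uses a more elaborate weighting scheme plus stemming, so this only shows the general idea.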