From: Pierre Neidhardt
Subject: Re: Inverted index to accelerate guix package search
Date: Wed, 15 Jan 2020 10:06:13 +0100
Message-ID: <878sm9w0l6.fsf@ambrevar.xyz>
References: <87a76r68u6.fsf@ambrevar.xyz>
To: Arun Isaac, Bengt Richter, zimoun
Cc: Guix Devel

Arun Isaac writes:

> I feel xapian is too much work (considering that we don't yet have guile
> bindings) compared to our own simple implementation of an inverted
> index. But, of course, I am biased since I wrote the inverted index
> code! :-)

Indeed, xapian bindings would be the biggest obstacle.

> But, on a more serious note, if we move to xapian, we will not be able
> to support regular expression based search queries that we support
> today.

We can always keep our current regexp search (which is trivial) for those who really want it. I believe that Xapian is much more usable than regexps on a daily basis.

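To make the comparison concrete, here is a minimal sketch of the two approaches side by side: a word-based inverted index and a plain regexp scan kept as a fallback. It is written in Python purely for illustration. It is not the Guile code discussed in this thread, and the sample package data and the function names (`build_index`, `search_index`, `search_regexp`) are assumptions made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical sample data; real package metadata lives in Guix itself.
PACKAGES = {
    "emacs": "The extensible, customizable, self-documenting text editor",
    "notmuch": "Thread-based email index, search, and tagging",
    "xapian": "Search engine library",
    "guile": "Scheme implementation intended especially for extensions",
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(packages):
    """Map each word to the set of packages whose name or description has it."""
    index = defaultdict(set)
    for name, description in packages.items():
        for word in tokenize(name + " " + description):
            index[word].add(name)
    return index


def search_index(index, query):
    """Return the packages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


def search_regexp(packages, pattern):
    """Regexp fallback: a linear scan over all packages, no index needed."""
    rx = re.compile(pattern, re.IGNORECASE)
    return {
        name
        for name, description in packages.items()
        if rx.search(name) or rx.search(description)
    }


INDEX = build_index(PACKAGES)
```

With this sketch, `search_index(INDEX, "search email")` intersects the entries for the two words, while `search_regexp(PACKAGES, r"edit(or|ing)")` scans every description. The index only matches exact words, so a query for "searching" finds nothing here; bridging that gap is what stemming is for.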
> On the other hand, I can extend the inverted index implementation
> to support regular expression searches. Personally, I don't use regular
> expression based search queries, and don't think they are very useful
> especially if we make use of xapian's stemming. What do people think?

Agreed! I see this with my emails (Notmuch): I type whatever words I remember and the names of whoever was involved in a thread, and I systematically find it. I've used it for months and it has never missed! :)

> On the question of whether xapian is too heavy, I think we should make
> it an optional dependency of Guix so that it remains possible to build
> and use a more minimalistic Guix for those interested in such things.

I suppose it wouldn't be too hard to make it optional. That said, with this little overhead and this much benefit, it seems to be a very nice default at first glance.

Cheers!

-- 
Pierre Neidhardt
https://ambrevar.xyz/