From: Arun Isaac
To: 60410@debbugs.gnu.org, rekado@elephly.net
Cc: Arun Isaac
Subject: [bug#60410] [PATCH 0/7] mumi: Boolean prefixes in xapian indexing and others
Date: Thu, 29 Dec 2022 20:18:09 +0000
Message-Id: <20221229201809.27997-1-arunisaac@systemreboot.net>
X-Mailer: git-send-email 2.38.1

Hi Ricardo,

This is a patchset that has been sleeping for some time in my local git repo. So, I thought it was about time to send it over!

The main change is that some xapian prefixes should be indexed as boolean prefixes. This makes the use of an implicit AND operator unnecessary and lets xapian do the natural thing of ordering results by relevance. I believe this improves the search significantly.

Also, since we retrieve search results by relevance, we can offload limiting of search results to xapian. Thus, we improve performance as well.

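To illustrate the idea, here is a toy model in Python. It is only a sketch under stated assumptions, not mumi's code and not xapian's API: the corpus, the filter names (`is:open`, `is:done`) and the `search` function are all hypothetical, and the score is a plain word count rather than xapian's actual weighting. Boolean terms only decide whether a document is included, free-text words are combined with OR and contribute to the score, and the cut-off to `limit` results happens inside the search itself.

```python
from collections import Counter

# Hypothetical miniature corpus; filter names and texts are made up for illustration.
DOCS = {
    1: {"filters": {"is:open"}, "text": "guix pull fails with substitute error"},
    2: {"filters": {"is:open"}, "text": "substitute server timeout during guix pull"},
    3: {"filters": {"is:done"}, "text": "guix pull substitute error fixed"},
}


def search(filters, words, limit):
    """Filter by boolean terms, rank by OR-combined word matches, keep the top `limit`."""
    hits = []
    for doc_id, doc in DOCS.items():
        # Boolean terms only include or exclude a document; they add no weight.
        if not filters <= doc["filters"]:
            continue
        counts = Counter(doc["text"].split())
        # Implicit OR: any matching word contributes, and more matches score higher.
        score = sum(counts[word] for word in words)
        if score > 0:
            hits.append((score, doc_id))
    # Highest score first, ties broken by document id; the limit is applied here
    # instead of by the caller.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for _, doc_id in hits[:limit]]
```

In this model, `search({"is:open"}, ["pull", "error"], 2)` returns `[1, 2]`: document 3 is excluded by the boolean filter, and document 1 ranks first because it matches both words.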
For this patchset to be useful, mumi's xapian index will have to be rebuilt. In general, it is good to periodically rebuild the xapian index from scratch.

Regards,
Arun

Arun Isaac (7):
  xapian: Index several terms as boolean and without positions.
  xapian: Declare some prefixes as boolean.
  xapian: Do not override the default OR implicit query operator.
  messages: Remove unused set intersection feature in search-bugs.
  messages: Offload limiting search results to xapian.
  cache: Specify that cache! returns the cached value.
  xapian: Preserve order of search results.

 mumi/cache.scm    |   3 +-
 mumi/messages.scm |  29 ++++--------
 mumi/xapian.scm   | 109 +++++++++++++++++++++++++++++++---------------
 3 files changed, 86 insertions(+), 55 deletions(-)

-- 
2.38.1