From: Arun Isaac <arunisaac@systemreboot.net>
To: 60410@debbugs.gnu.org, Ricardo Wurmus <rekado@elephly.net>
Cc: Arun Isaac <arunisaac@systemreboot.net>
Subject: [bug#60410] [PATCH 1/7] xapian: Index several terms as boolean and without positions.
Date: Thu, 29 Dec 2022 20:23:54 +0000
Message-ID: <20221229202400.28565-1-arunisaac@systemreboot.net>
In-Reply-To: <20221229201809.27997-1-arunisaac@systemreboot.net>

* mumi/xapian.scm (index-files): Index bug number, submitter, authors,
owner, severity, tags, status, file and msgids as boolean terms. Index
bug number, severity, tags, status, file and msgids without position
information.
---
 mumi/xapian.scm | 65 ++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 51 insertions(+), 14 deletions(-)
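
Note for readers: the diff below repeats the same pair of keyword
arguments for every boolean field. As a rough illustration of the
pattern only, it could be expressed as a small wrapper. The name
index-boolean-text! is hypothetical and is not introduced by this
patch. The sketch assumes that guile-xapian's index-text! is exported
by the (xapian xapian) module, accepts the #:prefix, #:wdf-increment
and #:positions? keywords used in the diff, and generates positions by
default.

```scheme
;; Sketch only: `index-boolean-text!' is a hypothetical helper, not
;; part of this patch.  Assumes (xapian xapian) exports `index-text!'
;; with the keyword arguments used in the diff below.
(use-modules (xapian xapian))

(define* (index-boolean-text! term-generator text prefix
                              #:key (positions? #t))
  "Index TEXT under PREFIX as boolean filter terms.  The within
document frequency is not incremented, and position information is
requested only when POSITIONS? is true."
  (index-text! term-generator text
               #:prefix prefix
               #:wdf-increment 0
               #:positions? positions?))

;; Usage, with `term-generator', `bugid' and `submitter' bound as in
;; `index-files':
;;   (index-boolean-text! term-generator bugid "B" #:positions? #f)
;;   (index-boolean-text! term-generator submitter "A")
```

The patch itself spells out each call in full, which keeps the
per-field choice about positions visible at every call site.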

diff --git a/mumi/xapian.scm b/mumi/xapian.scm
index 68169e8..06a54cd 100644
--- a/mumi/xapian.scm
+++ b/mumi/xapian.scm
@@ -1,6 +1,6 @@
 ;;; mumi -- Mediocre, uh, mail interface
 ;;; Copyright © 2020, 2022 Ricardo Wurmus <rekado@elephly.net>
-;;; Copyright © 2020 Arun Isaac <arunisaac@systemreboot.net>
+;;; Copyright © 2020, 2022 Arun Isaac <arunisaac@systemreboot.net>
 ;;;
 ;;; This program is free software: you can redistribute it and/or
 ;;; modify it under the terms of the GNU Affero General Public License
@@ -119,20 +119,57 @@ messages and index their contents in the Xapian database at DBPATH."
                   (term-generator (make-term-generator #:stem (make-stem "en")
                                                        #:document doc)))
              ;; Index fields with a suitable prefix. This allows for
-             ;; searching separate fields as in subject:foo,
-             ;; from:bar, etc.
-             (index-text! term-generator bugid #:prefix "B")
-             (index-text! term-generator submitter #:prefix "A")
-             (index-text! term-generator authors #:prefix "XA")
+             ;; searching separate fields as in subject:foo, from:bar,
+             ;; etc. We do not keep track of the within document
+             ;; frequencies of terms that will be used for boolean
+             ;; filtering. We do not generate position information for
+             ;; fields that will not need phrase searching or NEAR
+             ;; searches.
+             (index-text! term-generator
+                          bugid
+                          #:prefix "B"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          submitter
+                          #:prefix "A"
+                          #:wdf-increment 0)
+             (index-text! term-generator
+                          authors
+                          #:prefix "XA"
+                          #:wdf-increment 0)
              (index-text! term-generator subjects #:prefix "S")
-             (index-text! term-generator (or (bug-owner bug) "") #:prefix "XO")
-             (index-text! term-generator (or (bug-severity bug) "normal") #:prefix "XS")
-             (index-text! term-generator (or (bug-tags bug) "") #:prefix "XT")
-             (index-text! term-generator (cond
-                                          ((bug-done bug) "done")
-                                          (else "open")) #:prefix "XSTATUS")
-             (index-text! term-generator file #:prefix "F")
-             (index-text! term-generator msgids #:prefix "XU")
+             (index-text! term-generator
+                          (or (bug-owner bug) "")
+                          #:prefix "XO"
+                          #:wdf-increment 0)
+             (index-text! term-generator
+                          (or (bug-severity bug) "normal")
+                          #:prefix "XS"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          (or (bug-tags bug) "")
+                          #:prefix "XT"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          (cond
+                           ((bug-done bug) "done")
+                           (else "open"))
+                          #:prefix "XSTATUS"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          file
+                          #:prefix "F"
+                          #:wdf-increment 0
+                          #:positions? #f)
+             (index-text! term-generator
+                          msgids
+                          #:prefix "XU"
+                          #:wdf-increment 0
+                          #:positions? #f)
 
              ;; Index subject and body without prefixes for general
              ;; search.
-- 
2.38.1


Thread overview: 13+ messages
2022-12-29 20:18 [bug#60410] [PATCH 0/7] mumi: Boolean prefixes in xapian indexing and others Arun Isaac
2022-12-29 20:23 ` Arun Isaac [this message]
2022-12-31 18:09   ` [bug#60410] [PATCH 1/7] xapian: Index several terms as boolean and without positions Ricardo Wurmus
2022-12-31 23:02     ` Arun Isaac
2023-01-01 12:14       ` bug#60410: " Ricardo Wurmus
2022-12-29 20:23 ` [bug#60410] [PATCH 2/7] xapian: Declare some prefixes as boolean Arun Isaac
2023-01-01 23:19   ` Ricardo Wurmus
2023-01-02 17:01     ` Arun Isaac
2022-12-29 20:23 ` [bug#60410] [PATCH 3/7] xapian: Do not override the default OR implicit query operator Arun Isaac
2022-12-29 20:23 ` [bug#60410] [PATCH 4/7] messages: Remove unused set intersection feature in search-bugs Arun Isaac
2022-12-29 20:23 ` [bug#60410] [PATCH 5/7] messages: Offload limiting search results to xapian Arun Isaac
2022-12-29 20:23 ` [bug#60410] [PATCH 6/7] cache: Specify that cache! returns the cached value Arun Isaac
2022-12-29 20:24 ` [bug#60410] [PATCH 7/7] xapian: Preserve order of search results Arun Isaac