From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: "Jose A. Ortega Ruiz" <jao@gnu.org>
Cc: 44509@debbugs.gnu.org
Subject: bug#44509: 28.0.50; Error querying with new gnus-search and notmuch
Date: Wed, 11 Nov 2020 09:51:09 -0800
Message-ID: <877dqrhjea.fsf@ericabrahamsen.net>
In-Reply-To: <875z6cxyue.fsf@gnus.jao.io> (Jose A. Ortega Ruiz's message of "Wed, 11 Nov 2020 05:10:49 +0000")

"Jose A. Ortega Ruiz" <jao@gnu.org> writes:

> Hi again, Eric.
>
> I was looking at the notmuch engine code and wondering why it wasn't
> working for me in a leafnode directory, which puts its messages in a
> nnmail-compatible format (as far as i can tell).  And i discovered what
> looks like a possible bug in gnus-search-indexed (or a misunderstanding
> on my side).  Concretely, on line 1363 of gnus-search, when implementing
> gnus-search-indexed-parse-output, we're constructing a groups regexp
> that looks like:
>
> 	(group-regexp (when groups
> 			(regexp-opt
> 			 (mapcar
> 			  (lambda (x) (gnus-group-real-name x))
> 			  groups))))
>
> and then matching the returned list of files on that regexp (line 1377):
>
>        (string-match-p group-regexp f-name)))
>
> But gnus-group-real-name is giving me back a dot-separated path for
> nested folders (e.g. gmane.bugs.gnus), while the file paths in the
> results are using slashes (gmane/bugs/gnus/234 etc.), so the match
> test always fails and no results are returned.
>
> If i simply redefine group-regexp using:
>
>     (replace-regexp-in-string "\\." "/" (gnus-group-real-name x)))

This code comes straight from the old nnir.el, and has always looked a
bit fragile to me. The problem is that it really just expects each
message to be in a regular file, with groups as folders.

So you're using notmuch as a search engine for a local leafnode nntp
server, and indexing its message store directly?

Is there any leafnode setting that could influence how it stores its
messages? Can it be convinced to store them in hierarchical folders?

I suppose I could change the group-regexp to munge periods, but that
could cause breakage in other cases, and I would be hesitant to do that.
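
To make the mismatch and the risk concrete, here is a minimal sketch
that can be evaluated without Gnus loaded. The helper name
`my/group-regexp' and the sample paths are hypothetical, and the real
code also runs each group through `gnus-group-real-name' first, which
is left out here.

```elisp
;; Hypothetical helper, not part of gnus-search.  GROUPS is a list of
;; group names; with MUNGE-PERIODS non-nil, periods become forward
;; slashes before the regexp is built.
(defun my/group-regexp (groups &optional munge-periods)
  "Return a regexp matching any of GROUPS literally."
  (regexp-opt
   (mapcar (lambda (g)
             (if munge-periods
                 ;; FIXEDCASE and LITERAL are both t.
                 (replace-regexp-in-string "\\." "/" g t t)
               g))
           groups)))

;; Assumed leafnode-style path: one directory per group component.
(string-match-p (my/group-regexp '("gmane.bugs.gnus"))
                "/var/spool/news/gmane/bugs/gnus/234")   ; no match
(string-match-p (my/group-regexp '("gmane.bugs.gnus") t)
                "/var/spool/news/gmane/bugs/gnus/234")   ; match
```

The breakage I have in mind is the opposite layout: a backend that
keeps a group called "mail.misc" in a directory literally named
"mail.misc" would stop matching once the periods are munged.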

Otherwise, all the indexed search engines have a
`gnus-search-indexed-extract' method that's used to actually return the
file name from the results buffer. Each one's got something slightly
different. It would be very easy to create a new notmuch search engine
subclass that only overrides this method. To wit:

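(defclass gnus-search-leafnode-notmuch (gnus-search-notmuch)
  ()
  "Notmuch search engine for a leafnode spool (untested sketch).")

(cl-defmethod gnus-search-indexed-extract ((_engine gnus-search-leafnode-notmuch))
  "Return (FILE-NAME SCORE) for the result line at point, then advance."
  (let ((results-string (buffer-substring-no-properties
			 (line-beginning-position)
			 (line-end-position))))
    (prog1
	;; Turn periods into forward slashes; FIXEDCASE and LITERAL are t.
	(list (replace-regexp-in-string
	       "\\." "/" results-string t t)
	      100)
      (forward-line))))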

The `replace-regexp-in-string' changes the periods into forward
slashes. Then in your Gnus config:

'(nntp "your-local-leafnode"
   (gnus-search-engine gnus-search-leafnode-notmuch))

This is what I would do (I haven't tested the above, it might require
some tweaking). It takes advantage of all the benefits of the
generic-method approach, and lets you change behavior without messing
with the rest of the gnus-search code. You might also find it helpful to
tweak a few other slots or methods for this engine in particular.
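
If you want to check the line-handling logic before wiring it into
Gnus, something like the following should work in a scratch buffer.
It is only a sketch: `my/extract-line-as-path' is a hypothetical
stand-alone copy of the method body, and the sample lines assume the
results buffer holds one path per line with dotted group names.

```elisp
;; Hypothetical stand-alone version of the extract logic, so it can be
;; exercised without loading gnus-search or running notmuch.
(defun my/extract-line-as-path ()
  "Return (PATH SCORE) for the line at point, then move to the next line."
  (let ((line (buffer-substring-no-properties
               (line-beginning-position)
               (line-end-position))))
    (prog1
        ;; Periods become forward slashes; the score is a fixed 100.
        (list (replace-regexp-in-string "\\." "/" line t t) 100)
      (forward-line))))

;; Feed it two made-up result lines and collect what comes back.
(with-temp-buffer
  (insert "gmane.bugs.gnus/234\ngmane.bugs.gnus/235\n")
  (goto-char (point-min))
  (list (my/extract-line-as-path)
        (my/extract-line-as-path)))
```

Each call consumes one line, which is what lets the caller loop over
the results buffer until it runs out of lines.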

> On other news, i was trying to find a way in Gnus to go from Message-ID
> to article no. for IMAP groups or nnmaildirs (which would make using
> notmuch with dovecot really trivial), but without luck: anyone knows of
> an easy way?

Come to think of it, nnimap can already accept message-ids in place of
article numbers. So if notmuch returns its results as message-ids, it
should work transparently.

The problem is that while the mechanism is there, it works by searching
for each message-id and getting the proper article number that way. My
guess is that you'd be negating most of the speed advantage of using notmuch
like this, and you'd still be better off using dovecot's own full-text
indexing. You don't have to use xapian!

https://doc.dovecot.org/configuration_manual/fts/

BUT, if you really wanted to do this, it would be relatively easy to
check for "--output=messages" in 'switches, and use that instead of
"--output=files".
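
Roughly, and only as a sketch, the check could look like this.
`my/notmuch-output-flag' is a hypothetical helper, not something that
exists in gnus-search, and it assumes the engine's switches are a
plain list of strings.

```elisp
;; Hypothetical helper: decide which --output flag to hand to notmuch.
;; SWITCHES is assumed to be a list of command-line strings.
(defun my/notmuch-output-flag (switches)
  "Return \"--output=messages\" if SWITCHES asks for it, else \"--output=files\"."
  (if (member "--output=messages" switches)
      "--output=messages"
    "--output=files"))

(my/notmuch-output-flag '("--output=messages"))
(my/notmuch-output-flag nil)
```

The results would then be message-ids rather than file names, so the
parsing side would need to treat them accordingly.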

Eric




