* [RFC] Gnus generalized search, part II
@ 2017-04-21 21:35 Eric Abrahamsen
2017-04-22 0:16 ` Andrew Cohen
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: Eric Abrahamsen @ 2017-04-21 21:35 UTC (permalink / raw)
To: emacs-devel; +Cc: ding
I've been working on generalized searching for Gnus, where a single
query language gets translated into different search-engine-appropriate
strings. This allows searching multiple backends at once. It's more or
less working, and I'm attaching the new version of the nnir.el file here
which can be used as a drop-in replacement for the existing file.
Ideally, if accepted, this would get rebased on top of Andy Cohen's
reworking of nnir/nnselect.
How it works:
The query entered by the user is parsed into a sexp structure, and then
each engine is responsible for interpreting that.
For instance, you mark one IMAP group, and one maildir group (indexed
with notmuch). Then you enter a query:
"from:john after:1w or -mark:!"
Internally, this becomes:
((from . "john") (or (since 14 4 2017) (not (mark . "flag"))))
The imap engine turns that into:
"FROM john OR SINCE 14-Apr-2017 UNFLAGGED"
And the notmuch engine turns it into:
"from:john date:4/14/2017.. or not tag:flag"
Results from both servers are put in the same summary buffer.
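The translation step above can be sketched roughly as follows. This is an illustrative Python model, not the nnir.el code: the nested-tuple form of the query, the function names, and the mark table are assumptions made for the example, and only the three strings shown above are taken from the message.

```python
# Parsed form of "from:john after:1w or -mark:!", mirroring the sexp above
# as nested Python tuples (an assumption made for this sketch).
QUERY = [
    ("from", "john"),
    ("or", ("since", 14, 4, 2017), ("not", ("mark", "flag"))),
]

IMAP_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
# Hypothetical mapping from generic mark names to IMAP search keys.
IMAP_MARKS = {"flag": "FLAGGED", "seen": "SEEN", "read": "SEEN",
              "replied": "ANSWERED"}


def to_imap(expr):
    """Render one parsed expression as an IMAP SEARCH fragment."""
    head = expr[0]
    if head == "or":
        # IMAP's OR is a prefix operator taking two search keys.
        return "OR %s %s" % (to_imap(expr[1]), to_imap(expr[2]))
    if head == "not":
        inner = expr[1]
        if inner[0] == "mark":
            # UNFLAGGED, UNSEEN and UNANSWERED are the negated flag keys.
            return "UN" + IMAP_MARKS[inner[1]]
        return "NOT " + to_imap(inner)
    if head == "since":
        day, month, year = expr[1:]
        return "SINCE %d-%s-%d" % (day, IMAP_MONTHS[month - 1], year)
    if head == "mark":
        return IMAP_MARKS[expr[1]]
    # Anything else is treated as a header search.
    return "%s %s" % (head.upper(), expr[1])


def to_notmuch(expr):
    """Render one parsed expression as a notmuch query fragment."""
    head = expr[0]
    if head == "or":
        # notmuch's "or" is an infix operator.
        return "%s or %s" % (to_notmuch(expr[1]), to_notmuch(expr[2]))
    if head == "not":
        return "not " + to_notmuch(expr[1])
    if head == "since":
        day, month, year = expr[1:]
        return "date:%d/%d/%d.." % (month, day, year)
    if head == "mark":
        return "tag:" + expr[1]
    return "%s:%s" % (head, expr[1])


def render(query, translate):
    """Top-level expressions are implicitly ANDed, i.e. space-separated."""
    return " ".join(translate(expr) for expr in query)


print(render(QUERY, to_imap))
print(render(QUERY, to_notmuch))
```

The point of the sketch is that the parser never needs to know about any engine: each engine only supplies its own rendering of the same structure.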
That's pretty much it, I hope people will be interested in this. I've
started writing tests, and will do documentation if this is accepted.
I've pasted the complete docstring of nnir-search-parse-query below.
---------------------
Notes for the curious:
The search engines are now implemented as classes. This allowed for
factoring out a bunch of common code.
I nearly set this up for running multiple searches each in their own
thread, allowing for limited search concurrency. I backed off at the
last minute because of weird IMAP behavior, but the code is pretty much
set up for threads, if IMAP can get sorted out.
I re-implemented a limited version of the IMAP LITERAL+ code I wrote
years ago. If the server advertises support, searches for non-ASCII
strings will make use of the LITERAL+ mechanism. ¡¡Turning this on enforces
CHARSET UTF-8!! Ie, the assumption is that if a server can handle
LITERAL+, it can handle CHARSET UTF-8. This is probably totally wrong,
but it would be easy to shut off, or fix if I can figure out how to
DTRT.
So far as I can tell, Hyrex and Swish-e are defunct. They're still in
there, but their search transformation is lacking because there are no
good docs.
Namazu docs are also lacking: they give the examples of searching on
"message-id", "from", and "subject" headers, but are there more? I don't
know. I can't test because mknmz errors on my machine.
Things I'd like to add:
1. Support for IMAP MULTISEARCH and FUZZY
2. A command to automatically update all engine indexes.
3. Regular expression searches for engines that support them.
4. Engines for lucene, solr, raw xapian, sphinx... What else are people
using? There's a base class for locally-indexed search engines, so
these should be easy to add.
5. Create an offline index of gmane messages, to be updated monthly. The
gmane search engine would search locally but request remotely (only
partly joking).
------------------------------
nnir-search-parse-query is a Lisp closure.
(nnir-search-parse-query STRING)
Turn STRING into an s-expression based query.
The resulting query structure is passed to the various search
backends, each of which adapts it as needed.
The search "language" is essentially a series of key:value
expressions. Key is most often a mail header, but there are
other keys. Value is a string, quoted if it contains spaces.
Key and value are separated by a colon, no space. Expressions
are implicitly ANDed; the "or" keyword can be used to
OR. "not" will negate the following expression, or keys can be
prefixed with a "-". The "near" operator will work for
engines that understand it; other engines will convert it to
"or". Parenthetical groups work as expected.
A key that matches the name of a mail header will search that
header.
Search keys can be abbreviated so long as they remain
unambiguous, ie "f" will search the "from" header. "s" will raise an
error.
Other keys:
"address" will search all sender and recipient headers.
"recipient" will search "To", "Cc", and "Bcc".
"before" will search messages sent before the specified
date (date specifications to come later). Date is exclusive.
"after" (or its synonym "since") will search messages sent
after the specified date. Date is inclusive.
"mark" will search messages that have some sort of mark.
Likely values include "flag", "seen", "read", "replied".
It’s also possible to use Gnus’ internal marks, ie "mark:R"
will be interpreted as mark:read.
"tag" will search tags -- right now that’s translated to
"keyword" in IMAP, and left as "tag" for notmuch. At some
point this should also be used to search marks in the Gnus
registry.
"contact" will search messages to/from a contact. Contact
management packages must push a function onto
‘nnir-search-contact-sources’, the docstring of which see, for
this to work.
"contact-from" does what you’d expect.
"contact-to" searches the same headers as "recipient".
Other keys can be specified, provided that the search backends
know how to interpret them.
Date values (any key in ‘nnir-search-date-keys’) can be provided
in any format that ‘parse-time-string’ can parse (note that this
can produce weird results). Dates with missing bits will be
interpreted as the most recent occurrence thereof (ie "march 03"
is the most recent March 3rd). Lastly, relative specifications
such as 1d (one day ago) are understood. This also accepts w, m,
and y. m is assumed to be 30 days.
This function will accept pretty much anything as input. Its only job is
to parse the query into a sexp, and pass that on -- it is the job of the
search backends to make sense of the structured query. Malformed,
unusable or invalid queries will typically be silently ignored.
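Two of the rules in the docstring above, key abbreviation and relative dates, can be sketched in a few lines. This is a Python illustration rather than the Emacs Lisp implementation: the key list, the function names, and the choice of 365 days for "y" are assumptions (the docstring only fixes "m" at 30 days).

```python
import re
from datetime import date, timedelta

# Hypothetical key list: a few mail headers plus the extra keys named
# in the docstring above.
KEYS = ["from", "subject", "to", "cc", "bcc", "address", "recipient",
        "before", "after", "since", "mark", "tag",
        "contact", "contact-from", "contact-to"]

# "m" is 30 days per the docstring; 365 days for "y" is an assumption.
UNIT_DAYS = {"d": 1, "w": 7, "m": 30, "y": 365}


def expand_key(abbrev, keys=KEYS):
    """Expand an abbreviated key, raising ValueError if it is ambiguous."""
    if abbrev in keys:
        # An exact match wins, so "contact" is not confused with
        # "contact-from" or "contact-to".
        return abbrev
    matches = [key for key in keys if key.startswith(abbrev)]
    if len(matches) == 1:
        return matches[0]
    raise ValueError("ambiguous or unknown key %r: %s" % (abbrev, matches))


def relative_date(spec, today):
    """Turn a spec such as "1w" into a date counted back from TODAY."""
    match = re.fullmatch(r"(\d+)([dwmy])", spec)
    if match is None:
        raise ValueError("not a relative date: %r" % spec)
    days = int(match.group(1)) * UNIT_DAYS[match.group(2)]
    return today - timedelta(days=days)


print(expand_key("f"))
print(relative_date("1w", date(2017, 4, 21)))
```

With the message's send date as "today", "1w" resolves to 14 April 2017, which matches the (since 14 4 2017) form shown earlier, and "s" is rejected because it could mean either "subject" or "since".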
[-- Attachment #2: nnir.el --]
[-- Type: application/emacs-lisp, Size: 89882 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC] Gnus generalized search, part II
2017-04-21 21:35 [RFC] Gnus generalized search, part II Eric Abrahamsen
@ 2017-04-22 0:16 ` Andrew Cohen
2017-04-22 5:27 ` Eric Abrahamsen
2017-04-22 7:50 ` Eli Zaretskii
2017-04-22 19:53 ` Lars Ingebrigtsen
2017-04-23 13:48 ` Dan Christensen
2 siblings, 2 replies; 24+ messages in thread
From: Andrew Cohen @ 2017-04-22 0:16 UTC (permalink / raw)
To: emacs-devel; +Cc: ding
Dear Eric:
This is great (although I haven't tested it yet:))
Some questions/comments:
The previous version of nnir that I contributed could search multiple
backends (indeed something I did routinely with message threads that
were spread out across gmane and my personal email which in turn was a
combination of imap and namazu) but it was cumbersome since it worked
only with queries that were understood simultaneously by both
backends. This effectively reduced the searches to rather simple ones,
but it was nevertheless pretty effective. One aspect of the design that
was important was that C-g while searching wouldn't abort the whole
thing but just the current backend. So if the gmane search web site
wasn't responding, for example, a C-g would move things along and still
search the other backends. Can you make sure that your modifications do
the same? (They probably do, but I just want to make sure:))
I occasionally do some complicated searching which needs access to the
raw imap syntax (and indeed the imap thread referral uses the ability of
nnir-run-query to accept a raw imap search constructed in
'nnimap-make-thread-query) so I think it is important for nnir-run-query
to accept this format as well. I know I previously said it would be good
if the whole criteria thing goes away, but perhaps I was too
hasty. Right now a prefix arg to the gnus entry point to searching sets
the criteria in a complicated fashion; we could simplify this so that a
prefix arg skips all parsing of the query-spec and sends the raw query
directly to the query engine (as is the case now if you use a criterion
of 'imap). The user would presumably not try this with multiple backends
since the syntax wouldn't be common.
My limited window to work on this is rapidly closing. The good news is
that I think that all of the nnselect stuff is done (aside from some
renaming of things) and seems to be working. The bad news is that after
weeks of trying I still have no git access :( I have sent 2 emails to
this list, and on the advice of several people I have signed up for a
savannah account and requested membership in emacs, but it's been almost
a week with no response.
The changes are pretty large and somewhat invasive, and since there are
likely to be bugs found in testing I expect some back and forth with
testers and subsequent modifications in the code. I doubt I have the
wherewithal to do this without using git. So if I can't get access soon
I may have to forgo pushing these changes.
Best,
Andy
* Re: [RFC] Gnus generalized search, part II
2017-04-22 0:16 ` Andrew Cohen
@ 2017-04-22 5:27 ` Eric Abrahamsen
2017-04-22 8:08 ` Eli Zaretskii
2017-04-22 7:50 ` Eli Zaretskii
1 sibling, 1 reply; 24+ messages in thread
From: Eric Abrahamsen @ 2017-04-22 5:27 UTC (permalink / raw)
To: emacs-devel
Andrew Cohen <cohen@bu.edu> writes:
> Dear Eric:
>
> This is great (although I haven't tested it yet:))
>
> Some questions/comments:
>
> The previous version of nnir that I contributed could search multiple
> backends (indeed something I did routinely with message threads that
> were spread out across gmane and my personal email which in turn was a
> combination of imap and namazu) but it was cumbersome since it worked
> only with queries that were understood simultaneously by both
> backends. This effectively reduced the searches to rather simple ones,
> but it was nevertheless pretty effective. One aspect of the design that
> was important was that C-g while searching wouldn't abort the whole
> thing but just the current backend. So if the gmane search web site
> wasn't responding, for example, a C-g would move things along and still
> search the other backends. Can you make sure that your modifications do
> the same? (They probably do, but I just want to make sure:))
Interesting, I hadn't thought of that. Right now `nnir-run-query' looks
more or less like it did, the only difference with the indexed searches
is that instead of `call-process' they run `start-process' and
`accept-process-output'. It's possible that will react differently to
quitting, I'll try it and make sure it still behaves the same.
If I can get threading working, things may change again: I don't know
how threads interact with C-g.
> I occasionally do some complicated searching which needs access to the
> raw imap syntax (and indeed the imap thread referral uses the ability of
> nnir-run-query to accept a raw imap search constructed in
> 'nnimap-make-thread-query) so I think it is important for nnir-run-query
> to accept this format as well. I know I previously said it would be good
> if the whole criteria thing goes away, but perhaps I was too
> hasty. Right now a prefix arg to the gnus entry point to searching sets
> the criteria in a complicated fashion; we could simplify this so that a
> prefix arg skips all parsing of the query-spec and sends the raw query
> directly to the query engine (as is the case now if you use a criterion
> of 'imap). The user would presumably not try this with multiple backends
> since the syntax wouldn't be common.
Yes, I was thinking something similar -- users ought to be able to drop
to raw strings if they need to. That shouldn't be hard to implement at
all.
(That said, I think the query parsing can handle anything IMAP can
handle -- I'd like to know if it can't!)
> My limited window to work on this is rapidly closing. The good news is
> that I think that all of the nnselect stuff is done (aside from some
> renaming of things) and seems to be working. The bad news is that after
> weeks of trying I still have no git access :( I have sent 2 emails to
> this list, and on the advice of several people I have signed up for a
> savannah account and requested membership in emacs, but it's been almost
> a week with no response.
>
> The changes are pretty large and somewhat invasive, and since there are
> likely to be bugs found in testing I expect some back and forth with
> testers and subsequent modifications in the code. I doubt I have the
> wherewithal to do this without using git. So if I can't get access soon
> I may have to forgo pushing these changes.
That would be unfortunate. Here's hoping that gets sorted soon.
Eric
* Re: [RFC] Gnus generalized search, part II
2017-04-22 5:27 ` Eric Abrahamsen
@ 2017-04-22 8:08 ` Eli Zaretskii
2017-04-22 15:08 ` Eric Abrahamsen
0 siblings, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2017-04-22 8:08 UTC (permalink / raw)
To: Eric Abrahamsen; +Cc: emacs-devel
> From: Eric Abrahamsen <eric@ericabrahamsen.net>
> Date: Fri, 21 Apr 2017 22:27:45 -0700
>
> I don't know how threads interact with C-g.
How would you want threads to interact with C-g?
* Re: [RFC] Gnus generalized search, part II
2017-04-22 8:08 ` Eli Zaretskii
@ 2017-04-22 15:08 ` Eric Abrahamsen
2017-04-22 15:17 ` Eli Zaretskii
2017-04-22 16:00 ` Noam Postavsky
0 siblings, 2 replies; 24+ messages in thread
From: Eric Abrahamsen @ 2017-04-22 15:08 UTC (permalink / raw)
To: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Eric Abrahamsen <eric@ericabrahamsen.net>
>> Date: Fri, 21 Apr 2017 22:27:45 -0700
>>
>> I don't know how threads interact with C-g.
>
> How would you want threads to interact with C-g?
I'm still trying to get a correct mental model of how all this is
working. I assume that, if I gather the threads using:
(mapc #'thread-join threads)
None of the threads ever become the "current thread", and so C-g would
only ever signal quit to the main thread. So maybe instead of mapc, we
do:
(dolist (thread threads)
  (condition-case nil
      (thread-join thread)
    ;; `thread-signal' takes a DATA argument as well; nil is used here.
    (quit (thread-signal thread 'quit nil))))
According to my (limited, untested) understanding, that ought to do the
right thing.
While we're here, one more quick question. Gnus's IMAP servers collect
output by running this on a loop, until there's no more output:
(accept-process-output
(truncate nntp-read-timeout))
nntp-read-timeout is 0.1, and truncate turns that into 0. Is
(accept-process-output 0) the same as (accept-process-output nil)?
Thanks,
Eric
* Re: [RFC] Gnus generalized search, part II
2017-04-22 15:08 ` Eric Abrahamsen
@ 2017-04-22 15:17 ` Eli Zaretskii
2017-04-22 15:25 ` Eli Zaretskii
2017-04-22 19:25 ` Eric Abrahamsen
2017-04-22 16:00 ` Noam Postavsky
1 sibling, 2 replies; 24+ messages in thread
From: Eli Zaretskii @ 2017-04-22 15:17 UTC (permalink / raw)
To: Eric Abrahamsen; +Cc: emacs-devel
> From: Eric Abrahamsen <eric@ericabrahamsen.net>
> Date: Sat, 22 Apr 2017 08:08:05 -0700
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> From: Eric Abrahamsen <eric@ericabrahamsen.net>
> >> Date: Fri, 21 Apr 2017 22:27:45 -0700
> >>
> >> I don't know how threads interact with C-g.
> >
> > How would you want threads to interact with C-g?
>
> I'm still trying to get a correct mental model of how all this is
> working. I assume that, if I gather the threads using:
>
> (mapc #'thread-join threads)
>
> None of the threads ever become the "current thread", and so C-g would
> only ever signal quit to the main thread. So maybe instead of mapc, we
> do:
>
> (dolist (thread threads)
>   (condition-case nil
>       (thread-join thread)
>     (quit (thread-signal thread 'quit nil))))
>
> According to my (limited, untested) understanding, that ought to do the
> right thing.
But what _is_ the right thing?
I asked the question because I really would like to know what would
you want/expect to be the effect of C-g on the active threads? It's
not a rhetorical question. Can you please humor me?
> Is (accept-process-output 0) the same as (accept-process-output
> nil)?
Yes.
* Re: [RFC] Gnus generalized search, part II
2017-04-22 15:17 ` Eli Zaretskii
@ 2017-04-22 15:25 ` Eli Zaretskii
2017-04-22 19:25 ` Eric Abrahamsen
1 sibling, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2017-04-22 15:25 UTC (permalink / raw)
To: eric; +Cc: emacs-devel
> Date: Sat, 22 Apr 2017 18:17:47 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
> > Is (accept-process-output 0) the same as (accept-process-output
> > nil)?
>
> Yes.
Sorry, I've misread the code. Zero means don't wait at all, even if
process output is not available.
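The difference between a zero timeout and no timeout at all can be illustrated with an analogy outside Emacs. The Python sketch below uses `select.select` on a socket pair; it is only an analogy for the behaviour described above, and the helper name `poll_once` is made up for the example.

```python
import select
import socket


def poll_once(sock, timeout):
    """Return True if SOCK has data to read within TIMEOUT seconds.

    A timeout of 0 means "check and return at once"; a timeout of None
    would block until data arrives.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


reader, writer = socket.socketpair()

# Truncating 0.1 gives 0, just as `truncate' does in the Lisp snippet.
timeout = int(0.1)

# Nothing has been written yet, so a zero-timeout poll returns immediately
# and reports that no output is available.
nothing_yet = poll_once(reader, timeout)

writer.sendall(b"output")
# Once data is pending, a poll with a real timeout sees it.
has_output = poll_once(reader, 1.0)

reader.close()
writer.close()
print(nothing_yet, has_output)
```

In other words, truncating a sub-second timeout to zero turns a short wait into a pure poll, which is a different thing from waiting indefinitely.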
* Re: [RFC] Gnus generalized search, part II
2017-04-22 15:17 ` Eli Zaretskii
2017-04-22 15:25 ` Eli Zaretskii
@ 2017-04-22 19:25 ` Eric Abrahamsen
2017-04-22 20:06 ` Eli Zaretskii
2017-04-24 17:17 ` Stephen Leake
1 sibling, 2 replies; 24+ messages in thread
From: Eric Abrahamsen @ 2017-04-22 19:25 UTC (permalink / raw)
To: emacs-devel
Noam Postavsky <npostavs@users.sourceforge.net> writes:
> On Sat, Apr 22, 2017 at 11:08 AM, Eric Abrahamsen
> <eric@ericabrahamsen.net> wrote:
>> I'm still trying to get a correct mental model of how all this is
>> working. I assume that, if I gather the threads using:
>>
>> (mapc #'thread-join threads)
>>
>> None of the threads ever become the "current thread",
>
> Wouldn't thread-join stop the current thread, and then whichever thread
> runs next (probably one of the ones in `threads') would become the
> current thread?
Ah, of course! Yeesh, this is taking a while to wrap my brain around.
Eli Zaretskii <eliz@gnu.org> writes:
> But what _is_ the right thing?
>
> I asked the question because I really would like to know what would
> you want/expect to be the effect of C-g on the active threads? It's
> not a rhetorical question. Can you please humor me?
Okay! Sorry... Basically we're sending search queries to multiple
servers, and using threads to make the external processes asynchronous.
C-g would come into play when one or more of those processes hangs or is
slow, and the user loses patience and wants to quit. The desired result
would be that whichever thread we're currently waiting on gets killed,
and the other threads continue. Ideally there would be a message noting
which search process was abandoned, which is another reason to use
condition-case.
Ugh, this is hurting my brain. Here's the code skeleton:
(let* ((results "")
       (threads
        (mapcar
         (lambda (thing)
           (make-thread
            (lambda ()
              (let ((proc (start-process name buf thing-program)))
                (accept-process-output proc)
                (with-current-buffer buf
                  (setq results
                        (concat (buffer-string) results)))))))
         '(thing1 thing2))))
  (mapc #'thread-join threads))
accept-process-output is given no timeout. So when we hit the first
`thread-join', we wait for the first accept-process-output to return
completely, putting all its output in its process buffer. While it's
doing that, output from the second and third thread processes is also
arriving on the wire, but it's being buffered in C code or in the
process itself or in some other special non-Lisp place (I'm making this
part up, I have no idea). Assuming the three processes take the same
amount of time, the second and third `thread-join's should finish very
quickly, because all they have to do is dump their output from wherever
it's being held into their respective process buffers, and then concat
the buffer strings into the result.
Is that actually what happens?
I'm trying to think about what would happen if we looped the
`accept-process-output' on say a half-second timeout. When the first
`thread-join' is called, does it mean all three processes would start
getting half-second opportunities to write process output into their
output buffers? Or would the second and third threads not get to do
their `accept-process-output' calls at all until they were joined?
Here's where I get confused.
Realistically, the user would be unlikely to quit unless one of the
processes was taking a very long time, at which point that would be the
only running thread, and probably the right thing would happen.
Eli Zaretskii <eliz@gnu.org> writes:
>> Date: Sat, 22 Apr 2017 18:17:47 +0300
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: emacs-devel@gnu.org
>>
>> > Is (accept-process-output 0) the same as (accept-process-output
>> > nil)?
>>
>> Yes.
>
> Sorry, I've misread the code. Zero means don't wait at all, even if
> process output is not available.
Good to know, thanks. Actually, in a second I'll do a patch to add that
to the docstring, that's useful information.
Eric
* Re: [RFC] Gnus generalized search, part II
2017-04-22 19:25 ` Eric Abrahamsen
@ 2017-04-22 20:06 ` Eli Zaretskii
2017-04-22 22:50 ` Eric Abrahamsen
2017-04-24 17:17 ` Stephen Leake
1 sibling, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2017-04-22 20:06 UTC (permalink / raw)
To: Eric Abrahamsen; +Cc: emacs-devel
> From: Eric Abrahamsen <eric@ericabrahamsen.net>
> Date: Sat, 22 Apr 2017 12:25:12 -0700
>
> > But what _is_ the right thing?
> >
> > I asked the question because I really would like to know what would
> > you want/expect to be the effect of C-g on the active threads? It's
> > not a rhetorical question. Can you please humor me?
>
> Okay! Sorry... Basically we're sending search queries to multiple
> servers, and using threads to make the external processes asynchronous.
> C-g would come into play when one or more of those processes hangs or is
> slow, and the user loses patience and wants to quit. The desired result
> would be that whichever thread we're currently waiting on gets killed,
> and the other threads continue.
AFAIK, this should indeed happen, at least mostly.
> Ideally there would be a message noting which search process was
> abandoned, which is another reason to use condition-case.
You mean condition-case in the thread function?
> accept-process-output is given no timeout. So when we hit the first
> `thread-join', we wait for the first accept-process-output to return
> completely, putting all its output in its process buffer. While it's
> doing that, output from the second and third thread processes is also
> arriving on the wire, but it's being buffered in C code or in the
> process itself or in some other special non-Lisp place (I'm making this
> part up, I have no idea).
Not exactly. While the first thread waits for output, we let some
other thread run, until that other thread starts waiting as well. The
first thread whose wait is over will become active again, because the
main thread is waiting for thread-join. IOW, the main thread waits in
thread-join, whereas the other threads wait in accept-process-output.
I think.
> I'm trying to think about what would happen if we looped the
> `accept-process-output' on say a half-second timeout. When the first
> `thread-join' is called, does it mean all three processes would start
> getting half-second opportunities to write process output into their
> output buffers? Or would the second and third threads not get to do
> their `accept-process-output' calls at all until they were joined?
The first thread runs when the first thread-join is called by the main
thread. The second thread gets run when the first thread calls
accept-process-output. Etc. with the other threads.
I think there could be a problem if a thread finishes accepting its
output before its thread-join was called.
> Realistically, the user would be unlikely to quit unless one of the
> processes was taking a very long time, at which point that would be the
> only running thread, and probably the right thing would happen.
The problematic scenario is when the main thread gets the C-g. I'm
not sure this couldn't happen in your setup.
* Re: [RFC] Gnus generalized search, part II
2017-04-22 20:06 ` Eli Zaretskii
@ 2017-04-22 22:50 ` Eric Abrahamsen
2017-04-30 17:46 ` Eric Abrahamsen
0 siblings, 1 reply; 24+ messages in thread
From: Eric Abrahamsen @ 2017-04-22 22:50 UTC (permalink / raw)
To: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
[...]
>> Ideally there would be a message noting which search process was
>> abandoned, which is another reason to use condition-case.
>
> You mean condition-case in the thread function?
My original assumptions about how things work have mostly turned out
wrong. So sure, inside the thread function! I need to set up some dummy
shell programs and test this.
>> accept-process-output is given no timeout. So when we hit the first
>> `thread-join', we wait for the first accept-process-output to return
>> completely, putting all its output in its process buffer. While it's
>> doing that, output from the second and third thread processes is also
>> arriving on the wire, but it's being buffered in C code or in the
>> process itself or in some other special non-Lisp place (I'm making this
>> part up, I have no idea).
>
> Not exactly. While the first thread waits for output, we let some
> other thread run, until that other thread starts waiting as well. The
> first thread whose wait is over will become active again, because the
> main thread is waiting for thread-join. IOW, the main thread waits in
> thread-join, whereas the other threads wait in accept-process-output.
> I think.
[...]
> The first thread runs when the first thread-join is called by the main
> thread. The second thread gets run when the first thread calls
> accept-process-output. Etc. with the other threads.
Okay so this was a key point of confusion for me, and I think I've
finally got it, thank you. I'd misunderstood what Noam was telling me
about the scope of the "results" variable: I thought it needed to be
visible because the thread function started execution immediately, but
he was telling me it needed to be visible because it was captured in the
thread-function closure. Execution doesn't start until the main thread
yields.
So under normal circumstances (processes that take a non-negligible
amount of time to return) we'd do the first thread-join, then all three
threads would end up waiting in accept-process-output. What happens next
could depend on whether thread two or three came out of
accept-process-output before thread one did. As you say:
> I think there could be a problem if a thread finishes accepting its
> output before its thread-join was called.
It shouldn't be too hard to create this test condition.
>> Realistically, the user would be unlikely to quit unless one of the
>> processes was taking a very long time, at which point that would be the
>> only running thread, and probably the right thing would happen.
>
> The problematic scenario is when the main thread gets the C-g. I'm
> not sure this couldn't happen in your setup.
That could happen if the quit came between two of the calls to
`thread-join', yes? I suppose I could do something with inhibit-quit
around the mapc, but that's getting ahead of myself a bit.
Thanks again,
Eric
* Re: [RFC] Gnus generalized search, part II
2017-04-22 22:50 ` Eric Abrahamsen
@ 2017-04-30 17:46 ` Eric Abrahamsen
0 siblings, 0 replies; 24+ messages in thread
From: Eric Abrahamsen @ 2017-04-30 17:46 UTC (permalink / raw)
To: emacs-devel
Eric Abrahamsen <eric@ericabrahamsen.net> writes:
> Eli Zaretskii <eliz@gnu.org> writes:
>
> [...]
>
>>> Ideally there would be a message noting which search process was
>>> abandoned, which is another reason to use condition-case.
>>
>> You mean condition-case in the thread function?
>
> My original assumptions about how things work have mostly turned out
> wrong. So sure, inside the thread function! I need to set up some dummy
> shell programs and test this.
I finally got time to test this. I'm attaching a python script that I
used as the external process, and pasting below the code chunk I used
for testing. This is with emacs -Q, built from master this morning. I
opened a window on each of the three process buffers, and watched the
results come in. I'm not sure the `redisplay's are necessary, or a valid
measure of process response time, but it helped with eyeballing it.
Notes:
1. At first, I made the dumb mistake of writing "(dolist (t threads)" in
the final loop. This caused emacs to segfault, and output the
"attempt to set a constant" error on the command line. Obviously this
is wrong, but it probably shouldn't segfault.
2. I tweaked the sleep time parameters in various ways, but so far as I
can tell, output was returned correctly in all cases, even when the
first thread was given the longest sleep time. When the earlier
threads had shorter timeouts, sometimes the redisplay showed output
coming in to their buffer, sometimes it didn't. For my
purposes this doesn't matter.
3. Keyboard quit does nothing at all. Nothing is interrupted, everything
returns as normal.
So I played a bit with quitting. First, in the final dolist, I wrapped
each `thread-join' in a condition-case, which caught quit and used
`thread-signal' to send the quit to the thread.
The result was that the `thread-join' was quit, but not the thread or
its process. Ie, emacs stopped waiting on that thread and moved on to
the next one, but the process output still came in, and was inserted
into the correct buffer. Not too surprising, since the thread function
itself doesn't have any reason to pay attention to 'quit. I suppose that
this is okay in this setup, because the buffer has to exist: if the
buffer were deleted after the thread-join loop, the process would also
die.
But what about my actual use-case, where each thread is appending to the
value of a let-bound variable that is closed over in the thread
function? Say the longest thread-join is quit, the shorter thread-joins
return, and execution continues on in the main thread. We move out of
scope for the let-bound return variable, and then the last remaining
thread tries to set that variable. I'm guessing it'll segfault, but I
didn't try.
Then I added a second condition-case inside each thread function,
wrapping the `accept-process-output' call, catching quit, and using it
to call `kill-process' on "proc". So a keyboard quit gets first sent to
the thread function, and then on to the thread process. That behaved
pretty much the way I hoped it would, more or less. It was a crapshoot
which thread/process got killed, but they did get killed. Sometimes I
had to hit "C-g" several times before anything happened though. I wonder
if messing with `with-local-quit' or something could make that more
predictable.
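The arrangement described above, where a quit abandons one search while the others run to completion, can be modelled outside Emacs. The following is a Python sketch under stated assumptions: a `threading.Event` stands in for the quit signal, a timed wait stands in for the external process, and the backend names and `run_searches` helper are invented for the example. It says nothing about how Emacs threads actually deliver C-g.

```python
import threading


def search_worker(name, latency, cancel, results, lock):
    """Simulate one backend search that takes LATENCY seconds.

    Event.wait returns True if CANCEL was set before the latency elapsed,
    in which case this search is abandoned and contributes nothing.
    """
    if cancel.wait(timeout=latency):
        return
    with lock:
        results[name] = "results from %s" % name


def run_searches(backends, abandon=()):
    """Run all BACKENDS concurrently, abandoning those named in ABANDON."""
    results = {}
    lock = threading.Lock()
    cancels = {name: threading.Event() for name, _ in backends}
    threads = [
        threading.Thread(target=search_worker,
                         args=(name, latency, cancels[name], results, lock))
        for name, latency in backends
    ]
    for thread in threads:
        thread.start()
    # Stand-in for the user losing patience with specific backends.
    for name in abandon:
        cancels[name].set()
    for thread in threads:
        thread.join()
    return results


print(run_searches([("imap", 0.05), ("notmuch", 0.05), ("gmane", 30)],
                   abandon=["gmane"]))
```

Because each worker has its own cancellation flag, which search gets abandoned is chosen explicitly rather than depending on which thread happens to be current when the quit arrives.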
Anyway, I found all this interesting -- hope it's useful to someone
else.
#+BEGIN_SRC emacs-lisp
(setq lexical-binding t)
(defvar test-threads)
(defvar thread-test-prog)
;; Name of thread, process buffer, seconds for thread-test-prog to
;; sleep.
(setq test-threads `(("one" ,(get-buffer-create "*thread one*") "2")
                     ("two" ,(get-buffer-create "*thread two*") "10")
                     ("three" ,(get-buffer-create "*thread three*") "3")))
(setq thread-test-prog (expand-file-name "~/.bin/threadtest.py"))
(let ((threads
       (mapcar
        (lambda (el)
          (make-thread
           (lambda ()
             (let ((proc (start-process
                          (car el) (cadr el) thread-test-prog
                          "-t" (car el) "-s" (caddr el))))
               (accept-process-output proc)))
           (car el)))
        test-threads)))
  (dolist (el test-threads)
    (with-current-buffer (cadr el)
      (erase-buffer)))
  (dolist (th threads)
    (redisplay)
    (thread-join th)
    (redisplay)))
#+END_SRC
[-- Attachment #2: threadtest.py --]
[-- Type: text/x-python, Size: 365 bytes --]
#!/usr/bin/env python3
import time
import argparse
defsecs = 3
parser = argparse.ArgumentParser()
parser.add_argument("-t", "--thread")
parser.add_argument("-s", "--seconds", type=int)
args = parser.parse_args()
if not args.seconds:
    args.seconds = defsecs
time.sleep(args.seconds)
print("Process %s output (%s)" % (args.thread, time.strftime("%H:%M:%S")))
* Re: [RFC] Gnus generalized search, part II
2017-04-22 19:25 ` Eric Abrahamsen
2017-04-22 20:06 ` Eli Zaretskii
@ 2017-04-24 17:17 ` Stephen Leake
2017-04-26 9:42 ` Eli Zaretskii
1 sibling, 1 reply; 24+ messages in thread
From: Stephen Leake @ 2017-04-24 17:17 UTC (permalink / raw)
To: emacs-devel
Eric Abrahamsen <eric@ericabrahamsen.net> writes:
> Noam Postavsky <npostavs@users.sourceforge.net> writes:
>
> Basically we're sending search queries to multiple
> servers, and using threads to make the external processes asynchronous.
> C-g would come into play when one or more of those processes hangs or is
> slow, and the user loses patience and wants to quit. The desired result
> would be that whichever thread we're currently waiting on gets killed,
There should never be _one_ thread that you are waiting on (except when
there is only one left, of course); you should always be waiting for
_any_ thread to respond. Otherwise, you don't have a truly asynchronous
system; you have a polled synchronous system.
So the user would be saying "kill all outstanding threads".
--
-- Stephe
* Re: [RFC] Gnus generalized search, part II
2017-04-24 17:17 ` Stephen Leake
@ 2017-04-26 9:42 ` Eli Zaretskii
0 siblings, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2017-04-26 9:42 UTC (permalink / raw)
To: Stephen Leake; +Cc: emacs-devel
> From: Stephen Leake <stephen_leake@stephe-leake.org>
> Date: Mon, 24 Apr 2017 12:17:37 -0500
>
> > Basically we're sending search queries to multiple
> > servers, and using threads to make the external processes asynchronous.
> > C-g would come into play when one or more of those processes hangs or is
> > slow, and the user loses patience and wants to quit. The desired result
> > would be that whichever thread we're currently waiting on gets killed,
>
> There should never be _one_ thread that you are waiting on (except when
> there is only one left, of course); you should always be waiting for
> _any_ thread to respond.
Who is "you" in this context that is waiting? And what is meant by
"respond"?
> Otherwise, you don't have a truly asynchronous system; you have a
> polled synchronous system.
Emacs Lisp threads are indeed not a truly asynchronous system, and
cannot be used to produce any such system, because only one Lisp
thread can be running at any given time.
> So the user would be saying "kill all outstanding threads".
The application could catch C-g and kill threads. But the question
was what should C-g do without any application code, and the answer
cannot be kill all Lisp threads, because only one thread could ever
receive the C-g keystroke and act on it.
So maybe there's some mismatch of expectations here.
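To make "the application could catch C-g and kill threads" concrete, here is a minimal sketch. The main thread polls until every worker has exited, and on quit it signals each worker that is still alive. `my/join-all-or-cancel' is a hypothetical name, and `thread-live-p' is the Emacs 27+ spelling (earlier versions call it `thread-alive-p').

#+BEGIN_SRC emacs-lisp
(require 'seq)

(defun my/join-all-or-cancel (threads)
  "Wait until every thread in THREADS has exited.
Return t if all of them finished.  If the user hits C-g while
waiting, signal `quit' in each live thread and return nil."
  (condition-case nil
      (progn
        (while (seq-some #'thread-live-p threads)
          ;; Sleeping lets the other Lisp threads run.
          (sleep-for 0.1))
        t)
    (quit
     (dolist (th threads)
       (when (thread-live-p th)
         (thread-signal th 'quit nil)))
     nil)))
#+END_SRC

This waits on all threads rather than on one in particular, but it is still the main thread that receives the C-g and decides what to do with it.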
* Re: [RFC] Gnus generalized search, part II
2017-04-22 15:08 ` Eric Abrahamsen
2017-04-22 15:17 ` Eli Zaretskii
@ 2017-04-22 16:00 ` Noam Postavsky
1 sibling, 0 replies; 24+ messages in thread
From: Noam Postavsky @ 2017-04-22 16:00 UTC (permalink / raw)
To: Eric Abrahamsen; +Cc: Emacs developers
On Sat, Apr 22, 2017 at 11:08 AM, Eric Abrahamsen
<eric@ericabrahamsen.net> wrote:
>> How would you want threads to interact with C-g?
>
> I'm still trying to get a correct mental model of how all this is
> working. I assume that, if I gather the threads using:
>
> (mapc #'thread-join threads)
>
> None of the threads ever become the "current thread",
Wouldn't thread-join stop the current thread, and then whichever thread
runs next (probably one of the ones in `threads') would become the
current thread?
* Re: [RFC] Gnus generalized search, part II
2017-04-22 0:16 ` Andrew Cohen
2017-04-22 5:27 ` Eric Abrahamsen
@ 2017-04-22 7:50 ` Eli Zaretskii
2017-04-22 8:00 ` Andrew Cohen
1 sibling, 1 reply; 24+ messages in thread
From: Eli Zaretskii @ 2017-04-22 7:50 UTC (permalink / raw)
To: Andrew Cohen; +Cc: ding, emacs-devel
> From: Andrew Cohen <cohen@bu.edu>
> Date: Sat, 22 Apr 2017 08:16:08 +0800
> Cc: ding@gnus.org
>
> The bad news is that after weeks of trying I still have no git
> access :( I have sent 2 emails to this list, and on the advice of
> several people I have signed up for a savannah account and requested
> membership in emacs, but it's been almost a week with no response.
I approved you now. Sorry for the delay; John usually handles these
requests, but I guess he's still not out of the woods with his email
backlog.
(In the future, and for others' knowledge as well: if your schedule is
for some reason tight, please tell that explicitly in your request, so
that those who read the request could act in a timely manner.)
Thanks.
* Re: [RFC] Gnus generalized search, part II
2017-04-22 7:50 ` Eli Zaretskii
@ 2017-04-22 8:00 ` Andrew Cohen
0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cohen @ 2017-04-22 8:00 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Andrew Cohen, emacs-devel, ding
>>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:
>> From: Andrew Cohen <cohen@bu.edu> Date: Sat, 22 Apr 2017 08:16:08
>> +0800 Cc: ding@gnus.org
>>
>> The bad news is that after weeks of trying I still have no git
>> access :( I have sent 2 emails to this list, and on the advice of
>> several people I have signed up for a savannah account and
>> requested membership in emacs, but it's been almost a week with no
>> response.
Eli> I approved you now. Sorry for the delay; John usually handles
Eli> these requests, but I guess he's still not out of the woods
Eli> with his email backlog.
Thanks very much. I completely understand. I noticed that John hasn't
posted much recently on the list and assumed he must be busy.
Eli> (In the future, and for others' knowledge as well: if your
Eli> schedule is for some reason tight, please tell that explicitly
Eli> in your request, so that those who read the request could act
Eli> in a timely manner.)
You are right, I should have mentioned it. But the upside is I've had
more time on my own to ensure that all my code is entirely bug free :)
Best,
Andy
* Re: [RFC] Gnus generalized search, part II
2017-04-21 21:35 [RFC] Gnus generalized search, part II Eric Abrahamsen
2017-04-22 0:16 ` Andrew Cohen
@ 2017-04-22 19:53 ` Lars Ingebrigtsen
2017-04-22 20:26 ` Eric Abrahamsen
2017-04-24 20:30 ` Eric Abrahamsen
2017-04-23 13:48 ` Dan Christensen
2 siblings, 2 replies; 24+ messages in thread
From: Lars Ingebrigtsen @ 2017-04-22 19:53 UTC (permalink / raw)
To: Eric Abrahamsen; +Cc: ding, emacs-devel
Eric Abrahamsen <eric@ericabrahamsen.net> writes:
> The query entered by the user is parsed into a sexp structure, and then
> each engine is responsible for interpreting that.
I think this sounds like a good approach. I haven't tried the code
myself, but I skimmed it briefly and it looks good to me. :-)
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* Re: [RFC] Gnus generalized search, part II
2017-04-22 19:53 ` Lars Ingebrigtsen
@ 2017-04-22 20:26 ` Eric Abrahamsen
2017-04-24 20:30 ` Eric Abrahamsen
1 sibling, 0 replies; 24+ messages in thread
From: Eric Abrahamsen @ 2017-04-22 20:26 UTC (permalink / raw)
To: emacs-devel; +Cc: ding
Lars Ingebrigtsen <larsi@gnus.org> writes:
> Eric Abrahamsen <eric@ericabrahamsen.net> writes:
>
>> The query entered by the user is parsed into a sexp structure, and then
>> each engine is responsible for interpreting that.
>
> I think this sounds like a good approach. I haven't tried the code
> myself, but I skimmed it briefly and it looks good to me. :-)
Cool! Glad it's acceptable in theory. I am hoping that people who are
likely to care about this stuff will argue about the search language
syntax a bit. That will affect users the most, and also be the most
annoying to change subsequently. The rest is just bugs :)
* Re: [RFC] Gnus generalized search, part II
2017-04-22 19:53 ` Lars Ingebrigtsen
2017-04-22 20:26 ` Eric Abrahamsen
@ 2017-04-24 20:30 ` Eric Abrahamsen
2017-04-26 4:41 ` Andrew Cohen
` (2 more replies)
1 sibling, 3 replies; 24+ messages in thread
From: Eric Abrahamsen @ 2017-04-24 20:30 UTC (permalink / raw)
To: emacs-devel; +Cc: ding
Lars Ingebrigtsen <larsi@gnus.org> writes:
> Eric Abrahamsen <eric@ericabrahamsen.net> writes:
>
>> The query entered by the user is parsed into a sexp structure, and then
>> each engine is responsible for interpreting that.
>
> I think this sounds like a good approach. I haven't tried the code
> myself, but I skimmed it briefly and it looks good to me. :-)
Okay, I've pushed these changes as the scratch/nnir-search branch.
This branch is mostly the same as the nnir.el file I posted last time,
but with more switches for turning query parsing on and off. In the
storied Gnus tradition of over-customization, there are now a grand
total of four ways of controlling whether queries are parsed or raw:
1. The big switch is `nnir-use-parsed-queries'. It is t by default,
but if set to nil, Gnus will behave more or less the way it does
now.
2. If a prefix argument is given to the nnir search command (i.e., "C-u
G G" in the *Group* buffer), that search query will not be parsed,
and will be passed raw to all the marked servers/groups.
3. Individual search engines can be told never to parse search
queries, by specifying the `raw-queries-p' parameter to engine
creation. If multiple groups are marked for searching, the query
will be parsed for groups with engines that allow it, and not for
engines that don't.
4. Entire classes of engines can be marked never to parse queries, by
setting variables like nnir-notmuch-raw-queries-p, with "notmuch"
replaced by the various engine names. Again, queries to multiple
engines will still be parsed by engines that allow it.
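As a configuration sketch for switches 1 and 4, using only the variable names given above: the value t for the per-class variable is an assumption, and both variables exist only on the scratch branch.

#+BEGIN_SRC emacs-lisp
;; Switch 1: turn query parsing off globally.
(setq nnir-use-parsed-queries nil)

;; Switch 4: keep parsing on in general, but always pass raw
;; queries to notmuch engines (t assumed to mean "raw").
(setq nnir-notmuch-raw-queries-p t)
#+END_SRC

In practice only one of the two would be set: with the global switch off, the per-class variable has nothing left to override.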
I do hope people will test this. Actually testing that search groups
behave correctly is of course important, but if you just want to fool
with the search language and see how it is parsed and transformed, you
can use stuff like this:
#+BEGIN_SRC elisp
(let* ((query-string "subject:gnus or since:1w")
       (parsed-query (nnir-search-parse-query query-string))
       (test-imap (make-instance 'gnus-search-imap))
       (test-notmuch (make-instance 'gnus-search-notmuch)))
  (message "notmuch query: %s\nimap query: %s"
           (nnir-search-transform-top-level test-imap parsed-query)
           (nnir-search-transform-top-level test-notmuch parsed-query)))
#+END_SRC
`nnir-search-parse-query' turns strings into sexps, and
`nnir-search-transform-top-level' turns the sexps back into
engine-specific strings -- it requires an engine instance as the first
argument.
All the engines are named gnus-search-*. There are more on the way.
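For readers who want to see the shape of the transform step without checking out the branch, here is a toy version of the dispatch. It assumes nothing about the real implementation beyond the use of `cl-defmethod' with class and (head ...) specializers; the `my-toy-*' classes and `my-toy-transform' are invented for illustration.

#+BEGIN_SRC emacs-lisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)
(require 'eieio)

;; Hypothetical stand-ins for the real engine classes.
(defclass my-toy-imap () ())
(defclass my-toy-notmuch () ())

(cl-defgeneric my-toy-transform (engine expr)
  "Turn the parsed query EXPR into a query string for ENGINE.")

;; Fallback for plain (KEY . \"value\") pairs.
(cl-defmethod my-toy-transform ((_e my-toy-imap) (expr list))
  (format "%s %s" (upcase (symbol-name (car expr))) (cdr expr)))

(cl-defmethod my-toy-transform ((_e my-toy-notmuch) (expr list))
  (format "%s:%s" (car expr) (cdr expr)))

;; Boolean operators dispatch on the head of the list, which is
;; more specific than the generic `list' methods above.
(cl-defmethod my-toy-transform ((e my-toy-imap) (expr (head or)))
  (format "OR %s %s"
          (my-toy-transform e (nth 1 expr))
          (my-toy-transform e (nth 2 expr))))

(cl-defmethod my-toy-transform ((e my-toy-notmuch) (expr (head or)))
  (format "%s or %s"
          (my-toy-transform e (nth 1 expr))
          (my-toy-transform e (nth 2 expr))))

(my-toy-transform (make-instance 'my-toy-imap)
                  '(or (from . "john") (subject . "gnus")))
;; => "OR FROM john SUBJECT gnus"
(my-toy-transform (make-instance 'my-toy-notmuch)
                  '(or (from . "john") (subject . "gnus")))
;; => "from:john or subject:gnus"
#+END_SRC

Each engine only needs methods for the expression heads it treats specially; everything else falls through to its `list' method.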
* Re: [RFC] Gnus generalized search, part II
2017-04-24 20:30 ` Eric Abrahamsen
@ 2017-04-26 4:41 ` Andrew Cohen
2017-04-26 9:21 ` Joakim Jalap
2017-04-26 8:18 ` Andrew Cohen
2017-04-26 8:22 ` Andrew Cohen
2 siblings, 1 reply; 24+ messages in thread
From: Andrew Cohen @ 2017-04-26 4:41 UTC (permalink / raw)
To: emacs-devel; +Cc: ding
Hi Eric:
I noticed you have eliminated the 'shortcut option in nnir. This was
important functionality which should probably be restored (stopping
after the first match rather than finding all matches). For example, it's
used in article referral (when the referral methods include 'nnir) where
we are just trying to find a particular message-id.
I originally implemented this as part of the criteria in the search
query. Since these are going away in the universal search form the
shortcut implementation will be different. Maybe just include another
search term, like "?:" and "*:" for match-one and match-any, with "*:"
the default?
And on the long list of things that would be nice: when inputting the
universal query string TAB should complete on the preset search keys.
If I want to search for an author the current nnir allows this with only
a few keystrokes: C-u G G; "author-name" RET; f TAB; RET. gnus-search
should allow something similar.
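A rough sketch of what key completion could build on, using the standard completion functions. The key list is illustrative only (taken from the examples in this thread, not the full set), and all `my/' names are hypothetical. Completing a key in the middle of a longer query string would need more than this.

#+BEGIN_SRC emacs-lisp
(defvar my/search-keys '("from:" "subject:" "since:" "after:" "mark:")
  "Illustrative subset of search keys offered for completion.")

(defun my/search-key-candidates (prefix)
  "Return the search keys that start with PREFIX."
  (all-completions prefix my/search-keys))

(defun my/read-search-key ()
  "Read one search key in the minibuffer, with TAB completion."
  (completing-read "Search key: " my/search-keys nil t))

(my/search-key-candidates "f")   ; => ("from:")
#+END_SRC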
Best,
Andy
(Who is beginning to think that you and I are the only two who will ever
use this new stuff).
* Re: [RFC] Gnus generalized search, part II
2017-04-26 4:41 ` Andrew Cohen
@ 2017-04-26 9:21 ` Joakim Jalap
0 siblings, 0 replies; 24+ messages in thread
From: Joakim Jalap @ 2017-04-26 9:21 UTC (permalink / raw)
To: Andrew Cohen; +Cc: emacs-devel, ding
Andrew Cohen <cohen@bu.edu> writes:
> Andy
>
> (Who is beginning to think that you and I are the only two who will ever
> use this new stuff).
Just wanted to say that I will use it for sure :) It's just that gnus is
so complex I have trouble getting anything to work at all. Right now I'm
struggling with gnus-cloud. I've never really gotten search in gnus to
work for me at all, but I think this generalized search stuff might make
it much easier.
I will try to give it a spin during the weekend if I can.
-- Joakim
* Re: [RFC] Gnus generalized search, part II
2017-04-24 20:30 ` Eric Abrahamsen
2017-04-26 4:41 ` Andrew Cohen
@ 2017-04-26 8:18 ` Andrew Cohen
2017-04-26 8:22 ` Andrew Cohen
2 siblings, 0 replies; 24+ messages in thread
From: Andrew Cohen @ 2017-04-26 8:18 UTC (permalink / raw)
To: emacs-devel; +Cc: ding
I found a few minutes to try the generalized search but couldn't get the
file to load :( I suspect just some simple typos, but I haven't had the
time to track them down (and they all seem to be in search backends that
aren't functional anyway).
I made a (random) guess about what changes to make and got things
to load, but I haven't tried running any functions yet:)
Eric, did I make the right changes?
--- nnir_search.el~ 2017-04-22 17:03:10.509173535 +0800
+++ nnir_search.el 2017-04-26 16:13:25.806384881 +0800
@@ -1064,7 +1064,7 @@
(eieio-oset-default 'gnus-search-swish-e 'prefix
nnir-swish-e-remove-prefix)
-(eieio-oset-default 'gnus-search-swish-e 'config-file
+(eieio-oset-default 'gnus-search-swish-e 'index-files
nnir-swish-e-index-files)
(eieio-oset-default 'gnus-search-swish-e 'switches
@@ -1499,7 +1499,7 @@
-(cl-defmethod nnir-search-transform-expression ((engine gnus-engine-swish++)
+(cl-defmethod nnir-search-transform-expression ((engine gnus-search-swish++)
(expr (head near)))
(format "%s near %s"
(nnir-search-transform-expression engine (nth 1 expr))
@@ -1946,20 +1946,20 @@
(forward-line 1)))
(apply #'vector (nreverse (delete-dups artlist)))))
-(cl-defmethod nnir-search-transform-expression ((_e gnus-engine-gmane)
+(cl-defmethod nnir-search-transform-expression ((_e gnus-search-gmane)
(_expr (head near)))
nil)
;; Can Gmane handle OR or NOT keywords?
-(cl-defmethod nnir-search-transform-expression ((_e gnus-engine-gmane)
+(cl-defmethod nnir-search-transform-expression ((_e gnus-search-gmane)
(_expr (head or)))
nil)
-(cl-defmethod nnir-search-transform-expression ((_e gnus-engine-gmane)
+(cl-defmethod nnir-search-transform-expression ((_e gnus-search-gmane)
(_expr (head not)))
nil)
-(cl-defmethod nnir-search-transform-expression ((_e gnus-engine-gmane)
+(cl-defmethod nnir-search-transform-expression ((_e gnus-search-gmane)
(expr list))
"The only keyword value gmane can handle is author, ie from."
(when (memq (car expr) '(from sender author))
* Re: [RFC] Gnus generalized search, part II
2017-04-24 20:30 ` Eric Abrahamsen
2017-04-26 4:41 ` Andrew Cohen
2017-04-26 8:18 ` Andrew Cohen
@ 2017-04-26 8:22 ` Andrew Cohen
2 siblings, 0 replies; 24+ messages in thread
From: Andrew Cohen @ 2017-04-26 8:22 UTC (permalink / raw)
To: ding; +Cc: emacs-devel
OK, those changes seem to have worked and the example code works fine
(except that the notmuch and imap printout is reversed :))
I'll play around with more complex queries in a while.
* Re: [RFC] Gnus generalized search, part II
2017-04-21 21:35 [RFC] Gnus generalized search, part II Eric Abrahamsen
2017-04-22 0:16 ` Andrew Cohen
2017-04-22 19:53 ` Lars Ingebrigtsen
@ 2017-04-23 13:48 ` Dan Christensen
2 siblings, 0 replies; 24+ messages in thread
From: Dan Christensen @ 2017-04-23 13:48 UTC (permalink / raw)
To: emacs-devel; +Cc: ding
Looks interesting. Two questions:
Will mairix be supported?
How do you handle differing capabilities of the search backends?
E.g. mairix supports substring searches, and fuzzy matches, which
are handy if you are looking for a word that might be singular or
plural, or might be a noun or a verb, for example.
Dan
Thread overview: 24+ messages
2017-04-21 21:35 [RFC] Gnus generalized search, part II Eric Abrahamsen
2017-04-22 0:16 ` Andrew Cohen
2017-04-22 5:27 ` Eric Abrahamsen
2017-04-22 8:08 ` Eli Zaretskii
2017-04-22 15:08 ` Eric Abrahamsen
2017-04-22 15:17 ` Eli Zaretskii
2017-04-22 15:25 ` Eli Zaretskii
2017-04-22 19:25 ` Eric Abrahamsen
2017-04-22 20:06 ` Eli Zaretskii
2017-04-22 22:50 ` Eric Abrahamsen
2017-04-30 17:46 ` Eric Abrahamsen
2017-04-24 17:17 ` Stephen Leake
2017-04-26 9:42 ` Eli Zaretskii
2017-04-22 16:00 ` Noam Postavsky
2017-04-22 7:50 ` Eli Zaretskii
2017-04-22 8:00 ` Andrew Cohen
2017-04-22 19:53 ` Lars Ingebrigtsen
2017-04-22 20:26 ` Eric Abrahamsen
2017-04-24 20:30 ` Eric Abrahamsen
2017-04-26 4:41 ` Andrew Cohen
2017-04-26 9:21 ` Joakim Jalap
2017-04-26 8:18 ` Andrew Cohen
2017-04-26 8:22 ` Andrew Cohen
2017-04-23 13:48 ` Dan Christensen