* [PATCH RFC] index: add body: search query term
@ 2018-10-10 5:53 William Casarin
2018-10-10 10:43 ` David Bremner
0 siblings, 1 reply; 4+ messages in thread
From: William Casarin @ 2018-10-10 5:53 UTC
To: notmuch
This adds the ability to search specifically on the message body, e.g.:
notmuch search tag:notmuch and body:PATCH
Signed-off-by: William Casarin <jb55@jb55.com>
---
Hey there,
I'm looking to add the ability to search specifically on the body. I
was poking around in the indexer, added these lines and reindexed a
few tags. It appears to work!
I was just wondering if there's anything I'm missing? That seemed a
bit too easy. I noticed there are some NOTMUCH_FIELD_* flags whose
purpose I'm not sure about.
If anyone with Xapian knowledge could shed some light on what the next
steps might be, if any, I'd appreciate it.
Thanks!
Will
lib/database.cc | 3 +++
lib/index.cc | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/database.cc b/lib/database.cc
index 9cf8062c..0b085b21 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -297,6 +297,9 @@ prefix_t prefix_table[] = {
     { "subject",   "XSUBJECT",   NOTMUCH_FIELD_EXTERNAL |
                                  NOTMUCH_FIELD_PROBABILISTIC |
                                  NOTMUCH_FIELD_PROCESSOR},
+    { "body",      "XBODY",      NOTMUCH_FIELD_EXTERNAL |
+                                 NOTMUCH_FIELD_PROBABILISTIC |
+                                 NOTMUCH_FIELD_PROCESSOR},
 };

 static void
diff --git a/lib/index.cc b/lib/index.cc
index 3f694387..299b8770 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -506,7 +506,7 @@ _index_mime_part (notmuch_message_t *message,
     body = (char *) g_byte_array_free (byte_array, false);

     if (body) {
-        _notmuch_message_gen_terms (message, NULL, body);
+        _notmuch_message_gen_terms (message, "body", body);

         free (body);
     }
--
2.19.0
* Re: [PATCH RFC] index: add body: search query term
2018-10-10 5:53 [PATCH RFC] index: add body: search query term William Casarin
@ 2018-10-10 10:43 ` David Bremner
2018-10-10 16:34 ` William Casarin
0 siblings, 1 reply; 4+ messages in thread
From: David Bremner @ 2018-10-10 10:43 UTC
To: William Casarin, notmuch
William Casarin <jb55@jb55.com> writes:
>
> lib/database.cc | 3 +++
> lib/index.cc | 2 +-
> 2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/lib/database.cc b/lib/database.cc
> index 9cf8062c..0b085b21 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -297,6 +297,9 @@ prefix_t prefix_table[] = {
>      { "subject",   "XSUBJECT",   NOTMUCH_FIELD_EXTERNAL |
>                                   NOTMUCH_FIELD_PROBABILISTIC |
>                                   NOTMUCH_FIELD_PROCESSOR},
> +    { "body",      "XBODY",      NOTMUCH_FIELD_EXTERNAL |
> +                                 NOTMUCH_FIELD_PROBABILISTIC |
> +                                 NOTMUCH_FIELD_PROCESSOR},
>  };
>
>  static void
> diff --git a/lib/index.cc b/lib/index.cc
> index 3f694387..299b8770 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -506,7 +506,7 @@ _index_mime_part (notmuch_message_t *message,
>      body = (char *) g_byte_array_free (byte_array, false);
>
>      if (body) {
> -        _notmuch_message_gen_terms (message, NULL, body);
> +        _notmuch_message_gen_terms (message, "body", body);
>
>          free (body);
>      }
> --
I think you'll find you broke non-prefixed queries. Does the test suite
still pass? If so, we need more tests. Anyway, if you add a second set
of terms, I'd be interested in how much this bloats the index. Ideally
measured with the performance corpus, so we can all reproduce the
experiment.
d
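
To make the concern above concrete, here is a minimal sketch of a toy inverted index. It is not the notmuch or Xapian API: `ToyIndex`, `toy_gen_terms`, `toy_count` and `toy_postings` are hypothetical names, and the sketch assumes that passing a prefix produces only prefixed terms. Whether the real `_notmuch_message_gen_terms` behaves that way is not shown in this thread.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Toy inverted index: term -> set of document ids. This is NOT the notmuch
// or Xapian API; it only mimics a prefix being prepended to each term.
using ToyIndex = std::map<std::string, std::set<int>>;

// Hypothetical stand-in for term generation: split text on whitespace and
// store each word under prefix + word. An empty prefix means "unprefixed".
// Assumption: a non-empty prefix yields ONLY prefixed terms.
void toy_gen_terms(ToyIndex &index, int doc_id,
                   const std::string &prefix, const std::string &text)
{
    std::istringstream words(text);
    std::string word;
    while (words >> word)
        index[prefix + word].insert(doc_id);
}

// Number of documents matching a single term (0 if the term is absent).
std::size_t toy_count(const ToyIndex &index, const std::string &term)
{
    auto it = index.find(term);
    return it == index.end() ? 0 : it->second.size();
}

// Total number of (term, document) postings, a rough proxy for index size.
std::size_t toy_postings(const ToyIndex &index)
{
    std::size_t total = 0;
    for (const auto &entry : index)
        total += entry.second.size();
    return total;
}
```

Under that assumption, indexing a body only with the "XBODY" prefix leaves nothing for an unprefixed lookup to find, while indexing it both ways keeps unprefixed lookups working at the cost of twice the postings for body text. That doubling is the kind of bloat worth measuring on the performance corpus.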
* Re: [PATCH RFC] index: add body: search query term
2018-10-10 10:43 ` David Bremner
@ 2018-10-10 16:34 ` William Casarin
2018-10-10 16:36 ` William Casarin
0 siblings, 1 reply; 4+ messages in thread
From: William Casarin @ 2018-10-10 16:34 UTC
To: David Bremner, notmuch
David Bremner <david@tethera.net> writes:
> William Casarin <jb55@jb55.com> writes:
> I think you'll find you broke non-prefixed queries. Does the test suite
> still pass? If so, we need more tests.
Yeah, they seem to pass. But you're right, something seems a bit off:
./notmuch count subject:github or body:github and tag:notmuch
3271
./notmuch count github and tag:notmuch
665
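
One thing that may confound this comparison: if the query parser gives `and` higher precedence than `or` (an assumption here, not something confirmed in this thread), the first query groups as `subject:github or (body:github and tag:notmuch)`, so it also counts messages outside `tag:notmuch`. A toy sketch of the two groupings, with the hypothetical names `ToyMatch`, `as_written` and `grouped`:

```cpp
// Toy model: each flag records whether a message matches one query term.
// These are illustrative names, not notmuch types.
struct ToyMatch {
    bool subject_github;
    bool body_github;
    bool tag_notmuch;
};

// 'subject:github or body:github and tag:notmuch', ASSUMING 'and' binds
// more tightly than 'or' in the query parser.
bool as_written(const ToyMatch &m)
{
    return m.subject_github || (m.body_github && m.tag_notmuch);
}

// '(subject:github or body:github) and tag:notmuch' with explicit grouping.
bool grouped(const ToyMatch &m)
{
    return (m.subject_github || m.body_github) && m.tag_notmuch;
}
```

A message with "github" in its subject but without the notmuch tag matches `as_written` and not `grouped`. Rerunning the count with explicit parentheses would separate this effect from any indexing problem.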
> of terms I'd be interested how much this bloats the index. Ideally with
> the performance corpus so we can all reproduce the experiment.
Sounds good, I was wondering about that as well.
I wonder if it's all worth the effort though, since a workaround could
be:
notmuch search <query> and not subject:<query>
If it's too annoying to have a body prefix, due to index bloat or
performance issues, would doing something hacky such as translating
'body:<query>' to '<query> and not subject:<query>' make sense?
Will
--
https://jb55.com
* Re: [PATCH RFC] index: add body: search query term
2018-10-10 16:34 ` William Casarin
@ 2018-10-10 16:36 ` William Casarin
0 siblings, 0 replies; 4+ messages in thread
From: William Casarin @ 2018-10-10 16:36 UTC
To: David Bremner, notmuch
William Casarin <jb55@jb55.com> writes:
> I wonder if it's all worth the effort though, since a workaround could
> be:
>
> notmuch search <query> and not subject:<query>
>
> If it's too annoying to have a body prefix, due to index bloat or
> performance issues, would doing something hacky such as translating
> 'body:<query>' to '<query> and not subject:<query>' make sense?
Thinking about this some more, this is not exactly the same, since this
would explicitly exclude subjects, whereas the body query wouldn't care
what the subject was.
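
The difference can be sketched with a toy predicate model. `ToyMessage`, `matches_body` and `matches_workaround` are hypothetical names, substring matching stands in for term matching, and the sketch assumes an unprefixed term matches either the subject or the body:

```cpp
#include <string>

// Hypothetical toy message; not a notmuch type.
struct ToyMessage {
    std::string subject;
    std::string body;
};

// Substring search used as a crude stand-in for term matching.
static bool contains(const std::string &text, const std::string &word)
{
    return text.find(word) != std::string::npos;
}

// What a dedicated body: prefix would express: the word occurs in the body,
// regardless of the subject.
bool matches_body(const ToyMessage &m, const std::string &word)
{
    return contains(m.body, word);
}

// The rewrite '<query> and not subject:<query>', ASSUMING the unprefixed
// term matches either the subject or the body.
bool matches_workaround(const ToyMessage &m, const std::string &word)
{
    bool anywhere = contains(m.subject, word) || contains(m.body, word);
    return anywhere && !contains(m.subject, word);
}
```

A message with the word in both subject and body satisfies `matches_body` and fails `matches_workaround`, which is exactly the case where the rewrite and a real body prefix disagree.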
--
https://jb55.com
Thread overview: 4+ messages
2018-10-10 5:53 [PATCH RFC] index: add body: search query term William Casarin
2018-10-10 10:43 ` David Bremner
2018-10-10 16:34 ` William Casarin
2018-10-10 16:36 ` William Casarin