From: Austin Clements <amdragon@MIT.EDU>
To: Sebastien Binet <binet@cern.ch>
Cc: Notmuch developer list <notmuch@notmuchmail.org>
Subject: Re: query on a subset of messages ?
Date: Mon, 9 Jul 2012 12:30:00 -0400
Message-ID: <20120709163000.GG18195@mit.edu>
In-Reply-To: <871ukl5oj7.fsf@cern.ch>
Quoth Sebastien Binet on Jul 09 at 10:25 am:
>
> hi there,
>
> I was trying to reduce the I/O stress during my usual email
> fetching+tagging by writing a little program using the go bindings to
> notmuch.
>
> ie:
>   db, status := notmuch.OpenDatabase(db_path,
>       notmuch.DATABASE_MODE_READ_WRITE)
>   query := db.CreateQuery("(tag:new AND tag:inbox)")
>   msgs := query.SearchMessages()
>   for _,msg := range msgs {
>       tag_msg(msg, tagqueries)
>   }
>
>
> where tagqueries is a subquery of the form:
>   [
>     {
>       "Cmd": "+to-me",
>       "Query": "(to:sebastien.binet@cern.ch and not tag:to-me)"
>     },
>     {
>       "Cmd": "+sci-notmuch",
>       "Query": "from:notmuch@notmuchmail.org or to:notmuch@notmuchmail.org or subject:notmuch"
>     }
>   ]
>
>
> the idea being that I only need to crawl through the db only once and
> then iteratively apply tags on those messages (instead of repeatedly
> running "notmuch tag ..." for each and every of those many
> 'tag-queries')
>
> I couldn't find any C-API to do such a thing using the notmuch library.
> did I overlook something ?
>
> Is it something useful to add ?
>
> -s

Have you tried a more direct translation of the multiple notmuch tag
commands into Go, where you don't worry about subsetting the queries?
Unless you're tagging a huge number of messages, the cost of notmuch
tag is almost certainly dominated by the fsync it performs when it
closes the database (which every invocation of notmuch tag must do).
In Go, however, you can keep the database open across all of the
tagging operations and then close and fsync it just once.

Note that there is an important optimization in notmuch tag that you
might have to replicate. It manipulates the original query to exclude
messages that already have the desired tags, so that they get skipped
very efficiently at the earliest stage possible.