unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [RFC 0/2] add --new-tags= to notmuch-new(1)
@ 2021-09-16 10:25 andreas
  2021-09-16 10:25 ` [RFC 1/2] lib/config: introduce notmuch_config_values_from_string function andreas
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: andreas @ 2021-09-16 10:25 UTC (permalink / raw)
  To: notmuch

I've recently became a little annoyed with a race condition in my current
notmuch setup that originates from only having a single set of new tags for
all notmuch-new(1) invocations. In the past I've mentioned this a couple of
times in the IRC channel and now I got around to implement a basic version of
this.

I rougly process new messages like this:

  1. new messages get the "new" tag through notmuch-new(1),
  2. the post-new hook calls a series of scripts
    a) all mails with the "new" tag are processed by [muchclassy] which
       is responsible for applying tags to all mails from mailing lists
       following the schema list::org.kernel.vger.linux-kernel.
    b) all mails with the "new" tag and a tag that matches
       /^list::com\.github\./ are passed through another script of mine
       that queries the GitHub API and attaches tags (gh::closed,
       gh::merged, ...) to the thread.
    c) a notmuch-tag(1) --batch script is executed on all new mails that
       filters out some nosiy senders, groups mailing lists, filters out
       closed/merged GitHub PRs, ... and adds "unread"/"inbox"/... tags to
       mails that I want to see in my default inbox query.
    d) finally the "new" tag is removed from all mails.

There are right now 8 mailboxes that I am retrieving mails from. I have a
scheduled job that updates all my local Maildir's every couple of minutes.
That one doesn't cause any issues on its own.

But I also use IMAP IDLE to selectively update Maildir's as soon as a new mail
arrives.

If I receive mails on multiple Maildir's within a short period of time the
above process is running into race-conditions since there is no way to
distinguish mails that have just been marked new. All of them carry the same
tags.

In the worst case one of the last steps (2c/2d) would pick up the new mail
before any of the actual classification has been executed. This "leaks" mails
into my inbox which consequently can be overwhelming to look at.

With this series I am able to give each notmuch-new(1) invocation a unique tag
(think: new-$(uuidgen)). This doesn't (on its own) solve the entire story but
is a first step in the "right direction" IMHO. I still have to wrap the entire
workflow to propagate the unique tag to all the sub-commands (via. e.g an
environment variable).

I am posting this as an RFC to see what other users and the developers think
about this approach and if anyone has solved a similar issue (in a different
way).

An alternative that I have considered is using a post-new hook that applies
the unique tag and removes the default new tags. This would probably work but
smells like a workaround / hack.

There are two FIXME's in the docstrings of the new
notmuch_config_values_from_string function as I didn't know what version this
would possible first available in.


Let me know what you think.


Andi


Andreas Rammhold (2):
  lib/config: introduce notmuch_config_values_from_string function
  CLI/notmuch: add --new-tags argument to notmuch-new(1)

 doc/man1/notmuch-new.rst |  7 +++++++
 lib/config.cc            | 12 +++++++++++-
 lib/notmuch.h            | 14 ++++++++++++++
 notmuch-new.c            | 16 +++++++++++++++-
 test/T050-new.sh         |  6 ++++++
 test/T590-libconfig.sh   | 22 ++++++++++++++++++++++
 6 files changed, 75 insertions(+), 2 deletions(-)

-- 
2.32.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC 1/2] lib/config: introduce notmuch_config_values_from_string function
  2021-09-16 10:25 [RFC 0/2] add --new-tags= to notmuch-new(1) andreas
@ 2021-09-16 10:25 ` andreas
  2021-09-16 10:25 ` [RFC 2/2] CLI/notmuch: add --new-tags argument to notmuch-new(1) andreas
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: andreas @ 2021-09-16 10:25 UTC (permalink / raw)
  To: notmuch; +Cc: Andreas Rammhold

From: Andreas Rammhold <andreas@rammhold.de>

This factors out the code required to initialize a
notmuch_config_values_t from a string into a dedicated function. By
exposing this function we can allow consumers (such as notmuch-new(1))
to customize certain configuration values via command line flags.
---
 lib/config.cc          | 12 +++++++++++-
 lib/notmuch.h          | 14 ++++++++++++++
 test/T590-libconfig.sh | 22 ++++++++++++++++++++++
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/lib/config.cc b/lib/config.cc
index 8775b00a..195483a6 100644
--- a/lib/config.cc
+++ b/lib/config.cc
@@ -280,6 +280,16 @@ notmuch_config_get_values (notmuch_database_t *notmuch, notmuch_config_key_t key
 
 notmuch_config_values_t *
 notmuch_config_get_values_string (notmuch_database_t *notmuch, const char *key_str)
+{
+    const char * val = _notmuch_string_map_get (notmuch->config, key_str);
+    if (! val)
+        return NULL;
+
+    return notmuch_config_values_from_string (notmuch, val);
+}
+
+notmuch_config_values_t *
+notmuch_config_values_from_string (notmuch_database_t *notmuch, const char *val)
 {
     notmuch_config_values_t *values = NULL;
     bool ok = false;
@@ -290,7 +300,7 @@ notmuch_config_get_values_string (notmuch_database_t *notmuch, const char *key_s
 
     values->children = talloc_new (values);
 
-    values->string = _notmuch_string_map_get (notmuch->config, key_str);
+    values->string = val;
     if (! values->string)
 	goto DONE;
 
diff --git a/lib/notmuch.h b/lib/notmuch.h
index 546643e8..59dc9dae 100644
--- a/lib/notmuch.h
+++ b/lib/notmuch.h
@@ -2608,6 +2608,20 @@ notmuch_config_get_values (notmuch_database_t *notmuch, notmuch_config_key_t key
 notmuch_config_values_t *
 notmuch_config_get_values_string (notmuch_database_t *notmuch, const char *key);
 
+
+/**
+ * Returns an iterator for a ';'-delimited list of configuration values
+ *
+ * @param[in] notmuch database
+ * @param[in] val configuration value
+ *
+ * @since libnotmuch FIXME (notmuch FIXME)
+ *
+ * @retval NULL in case of error.
+ */
+notmuch_config_values_t *
+notmuch_config_values_from_string (notmuch_database_t *notmuch, const char* val);
+
 /**
  * Is the given 'config_values' iterator pointing at a valid element.
  *
diff --git a/test/T590-libconfig.sh b/test/T590-libconfig.sh
index 59b82a6f..3df2ffec 100755
--- a/test/T590-libconfig.sh
+++ b/test/T590-libconfig.sh
@@ -295,6 +295,28 @@ EOF
 test_expect_equal_file EXPECTED OUTPUT
 restore_database
 
+test_begin_subtest "notmuch_config_values_from_string"
+cat c_head - c_tail <<'EOF' | test_C ${MAIL_DIR} ${NOTMUCH_CONFIG} %NULL%
+{
+    notmuch_config_values_t *values = notmuch_config_values_from_string (db, "x;y;z");
+    for (notmuch_config_values_start (values);
+	 notmuch_config_values_valid (values);
+	 notmuch_config_values_move_to_next (values))
+    {
+	  puts (notmuch_config_values_get (values));
+    }
+}
+EOF
+cat <<'EOF' >EXPECTED
+== stdout ==
+x
+y
+z
+== stderr ==
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+restore_database
+
 test_begin_subtest "notmuch_config_get_values (restart)"
 cat c_head - c_tail <<'EOF' | test_C ${MAIL_DIR} ${NOTMUCH_CONFIG} %NULL%
 {
-- 
2.32.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC 2/2] CLI/notmuch: add --new-tags argument to notmuch-new(1)
  2021-09-16 10:25 [RFC 0/2] add --new-tags= to notmuch-new(1) andreas
  2021-09-16 10:25 ` [RFC 1/2] lib/config: introduce notmuch_config_values_from_string function andreas
@ 2021-09-16 10:25 ` andreas
  2021-09-16 12:03 ` [RFC 0/2] add --new-tags= " Michael J Gruber
  2021-09-20 23:57 ` David Bremner
  3 siblings, 0 replies; 7+ messages in thread
From: andreas @ 2021-09-16 10:25 UTC (permalink / raw)
  To: notmuch; +Cc: Andreas Rammhold

From: Andreas Rammhold <andreas@rammhold.de>

This introduces a new argument to notmuch-new(1) that allows specfying
the tags that should be applied to new messages. The new option takes
precedence over the new.tags setting configured in the configuration
file.

The need for this option did arise while due to race-conditions between
notmuch tagging messages as new, running my post-new hook and unsetting the
new tag as last step of my processing.
---
 doc/man1/notmuch-new.rst |  7 +++++++
 notmuch-new.c            | 16 +++++++++++++++-
 test/T050-new.sh         |  6 ++++++
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/doc/man1/notmuch-new.rst b/doc/man1/notmuch-new.rst
index 9cb4a54e..d1ed5536 100644
--- a/doc/man1/notmuch-new.rst
+++ b/doc/man1/notmuch-new.rst
@@ -78,6 +78,13 @@ Supported options for **new** include
    to optimize the scanning of directories for new mail. This option turns
    that optimization off.
 
+.. option:: --new-tags=<tags>
+
+   If set defines an alternative value for the new tags that will be applied to
+   all newly added messages.
+
+   See also ``new.tags`` in :any:`notmuch-config(1)`.
+
 EXIT STATUS
 ===========
 
diff --git a/notmuch-new.c b/notmuch-new.c
index b7a5f2ea..80ef35d5 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -1125,6 +1125,8 @@ notmuch_new_command (notmuch_database_t *notmuch, int argc, char *argv[])
     bool timer_is_active = false;
     bool hooks = true;
     bool quiet = false, verbose = false;
+    const char *new_tags = NULL;
+    notmuch_config_values_t *new_tags_config = NULL;
     notmuch_status_t status;
 
     notmuch_opt_desc_t options[] = {
@@ -1133,6 +1135,7 @@ notmuch_new_command (notmuch_database_t *notmuch, int argc, char *argv[])
 	{ .opt_bool = &add_files_state.debug, .name = "debug" },
 	{ .opt_bool = &add_files_state.full_scan, .name = "full-scan" },
 	{ .opt_bool = &hooks, .name = "hooks" },
+	{ .opt_string = &new_tags, .name = "new-tags" },
 	{ .opt_inherit = notmuch_shared_indexing_options },
 	{ .opt_inherit = notmuch_shared_options },
 	{ }
@@ -1150,7 +1153,15 @@ notmuch_new_command (notmuch_database_t *notmuch, int argc, char *argv[])
     else if (verbose)
 	add_files_state.verbosity = VERBOSITY_VERBOSE;
 
-    add_files_state.new_tags = notmuch_config_get_values (notmuch, NOTMUCH_CONFIG_NEW_TAGS);
+    if (! new_tags) {
+	add_files_state.new_tags = notmuch_config_get_values (notmuch, NOTMUCH_CONFIG_NEW_TAGS);
+    } else {
+	new_tags_config = notmuch_config_values_from_string (notmuch, new_tags);
+	if (! new_tags_config)
+	    return EXIT_FAILURE;
+
+	add_files_state.new_tags = new_tags_config;
+    }
 
     if (print_status_database (
 	    "notmuch new",
@@ -1290,6 +1301,9 @@ notmuch_new_command (notmuch_database_t *notmuch, int argc, char *argv[])
     talloc_free (add_files_state.removed_directories);
     talloc_free (add_files_state.directory_mtimes);
 
+    if (new_tags_config)
+	talloc_free (new_tags_config);
+
     if (timer_is_active)
 	stop_progress_printing_timer ();
 
diff --git a/test/T050-new.sh b/test/T050-new.sh
index 1141c1e3..6b6f5071 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -37,6 +37,12 @@ test_begin_subtest "No new messages (full-scan)"
 output=$(NOTMUCH_NEW --debug --full-scan 2>&1)
 test_expect_equal "$output" "No new mail."
 
+test_begin_subtest "Single new message with custom tags"
+generate_message
+NOTMUCH_NEW --debug --new-tags="foo-new;bar-new"
+output=$(notmuch count tag:foo-new and tag:bar-new)
+test_expect_equal "$output" "1"
+
 test_begin_subtest "New directories"
 rm -rf "${MAIL_DIR}"/* "${MAIL_DIR}"/.notmuch
 mkdir "${MAIL_DIR}"/def
-- 
2.32.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC 0/2] add --new-tags= to notmuch-new(1)
  2021-09-16 10:25 [RFC 0/2] add --new-tags= to notmuch-new(1) andreas
  2021-09-16 10:25 ` [RFC 1/2] lib/config: introduce notmuch_config_values_from_string function andreas
  2021-09-16 10:25 ` [RFC 2/2] CLI/notmuch: add --new-tags argument to notmuch-new(1) andreas
@ 2021-09-16 12:03 ` Michael J Gruber
  2021-09-16 12:24   ` andreas
  2021-09-20 23:57 ` David Bremner
  3 siblings, 1 reply; 7+ messages in thread
From: Michael J Gruber @ 2021-09-16 12:03 UTC (permalink / raw)
  To: andreas, notmuch

andreas@rammhold.de venit, vidit, dixit 2021-09-16 12:25:15:
> I've recently became a little annoyed with a race condition in my current
> notmuch setup that originates from only having a single set of new tags for
> all notmuch-new(1) invocations. In the past I've mentioned this a couple of
> times in the IRC channel and now I got around to implement a basic version of
> this.
> 
> I rougly process new messages like this:
> 
>   1. new messages get the "new" tag through notmuch-new(1),
>   2. the post-new hook calls a series of scripts
...
> There are right now 8 mailboxes that I am retrieving mails from. I have a
> scheduled job that updates all my local Maildir's every couple of minutes.
> That one doesn't cause any issues on its own.
> 
> But I also use IMAP IDLE to selectively update Maildir's as soon as a new mail
> arrives.
> 
> If I receive mails on multiple Maildir's within a short period of time the
> above process is running into race-conditions since there is no way to
> distinguish mails that have just been marked new. All of them carry the same
> tags.
> 
> In the worst case one of the last steps (2c/2d) would pick up the new mail
> before any of the actual classification has been executed. This "leaks" mails
> into my inbox which consequently can be overwhelming to look at.
> 
> With this series I am able to give each notmuch-new(1) invocation a unique tag
> (think: new-$(uuidgen)). This doesn't (on its own) solve the entire story but
> is a first step in the "right direction" IMHO. I still have to wrap the entire

I very much sympathize with your setup. But I think the real solution
would be one of these options:

- use a lock file to prevent your scripts from running concurrently OR

- match "tag:new and folder:UNIQUEFOLDER" (or "path:FOO/**")
  This should be a perfect substitute for your new-UNIQUE tag.

As for the implementation you suggest: Basically, you implement
overriding the "new.tags" config, and I'm wondering whether it would be
worthwhile to implement "notmuch --config-value" instead:

--config-value=SECTION.ITEM=VALUE
	Override the config setting for SECTION.ITEM with the VALUE for
	this invocation. This takes precedence over any setting in the
	config file.

I know this raises the question which config any hooks will see, but
that is the same for your implementation.

You can override the complete config by specifying a different file
already, of course, so you could script that.

Cheers
Michael

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 0/2] add --new-tags= to notmuch-new(1)
  2021-09-16 12:03 ` [RFC 0/2] add --new-tags= " Michael J Gruber
@ 2021-09-16 12:24   ` andreas
  2021-09-16 14:43     ` Michael J Gruber
  0 siblings, 1 reply; 7+ messages in thread
From: andreas @ 2021-09-16 12:24 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: notmuch

On 14:03 16.09.21, Michael J Gruber wrote:
> I very much sympathize with your setup. But I think the real solution
> would be one of these options:
> 
> - use a lock file to prevent your scripts from running concurrently OR

That would work. I'd love if that could be moved into notmuch such
that running `notmuch new` will keep a (global) lock until all of the
messages have been processed and all of the hooks have been executed.

This is probably tricky to get right consistently across the notmuch
code base. At least that was one of the comments I recall from a
conversation on IRC when that topic was brought up.

> - match "tag:new and folder:UNIQUEFOLDER" (or "path:FOO/**")
>   This should be a perfect substitute for your new-UNIQUE tag.

How would the files end up in the UNIQUEFOLDER? What I probably forgot
to mention is that I do not sync only on the Mailbox boundary. I have an
IAMP IDLE instance for "Inbox" on all of the accounts and then more for
other folders in those accounts. I could probably script that (wouldn't
take more than an evening I am sure) but requires a lot more additional
code.


> As for the implementation you suggest: Basically, you implement
> overriding the "new.tags" config, and I'm wondering whether it would be
> worthwhile to implement "notmuch --config-value" instead:
> 
> --config-value=SECTION.ITEM=VALUE
> 	Override the config setting for SECTION.ITEM with the VALUE for
> 	this invocation. This takes precedence over any setting in the
> 	config file.

That is an interesting idea. How would you implement that? Right now it
looks like all changes in configuration are written to disk when you
change them. That would require keeping a copy of all the settings in
RAM (which is probably already the case) and only mutating those without
writing to disk.


> I know this raises the question which config any hooks will see, but
> that is the same for your implementation.

Yeah, I was thinking about that as well. I did check if notmuch actually
passed anything via environment vars to the hooks. That doesn't seem to
be the case.

> You can override the complete config by specifying a different file
> already, of course, so you could script that.

Yes, that is also something I considered but I'd rather avoid that. I've
had some issues with defining the notmuch configuration in the past. For
some reason it always failed to find the file or load the correct one :(
Was probably some local issue - I hope.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 0/2] add --new-tags= to notmuch-new(1)
  2021-09-16 12:24   ` andreas
@ 2021-09-16 14:43     ` Michael J Gruber
  0 siblings, 0 replies; 7+ messages in thread
From: Michael J Gruber @ 2021-09-16 14:43 UTC (permalink / raw)
  To: andreas; +Cc: notmuch

andreas@rammhold.de venit, vidit, dixit 2021-09-16 14:24:14:
> On 14:03 16.09.21, Michael J Gruber wrote:
> > I very much sympathize with your setup. But I think the real solution
> > would be one of these options:
> > 
> > - use a lock file to prevent your scripts from running concurrently OR
> 
> That would work. I'd love if that could be moved into notmuch such
> that running `notmuch new` will keep a (global) lock until all of the
> messages have been processed and all of the hooks have been executed.
> 
> This is probably tricky to get right consistently across the notmuch
> code base. At least that was one of the comments I recall from a
> conversation on IRC when that topic was brought up.

I don't think it's notmuch's job to do that. But if you call notmuch for
a specific folder you can instead call a wrapper script which keeps a
folder-specific lock until that notmuch returns, and can you pass the
folder to the hooks via environment.

> > - match "tag:new and folder:UNIQUEFOLDER" (or "path:FOO/**")
> >   This should be a perfect substitute for your new-UNIQUE tag.
> 
> How would the files end up in the UNIQUEFOLDER? What I probably forgot
> to mention is that I do not sync only on the Mailbox boundary. I have an
> IAMP IDLE instance for "Inbox" on all of the accounts and then more for
> other folders in those accounts. I could probably script that (wouldn't
> take more than an evening I am sure) but requires a lot more additional
> code.

I assumed that each of your different notmuch calls (from different IDLE
instances) act on different folders or sets of folders, so that you can
distinguish the messages either way.

Depending on your setup, there may still be issues when a second sync
interferes so that the "first" notmuch new tags messages from the
"second" sync. That's exactly what a lock can prevent if you do the sync
in a pre-new hook.

> 
> > As for the implementation you suggest: Basically, you implement
> > overriding the "new.tags" config, and I'm wondering whether it would be
> > worthwhile to implement "notmuch --config-value" instead:
> > 
> > --config-value=SECTION.ITEM=VALUE
> >       Override the config setting for SECTION.ITEM with the VALUE for
> >       this invocation. This takes precedence over any setting in the
> >       config file.
> 
> That is an interesting idea. How would you implement that? Right now it
> looks like all changes in configuration are written to disk when you
> change them. That would require keeping a copy of all the settings in
> RAM (which is probably already the case) and only mutating those without
> writing to disk.

... and notmuch config get called from a hook should return that value,
ideally (I don't know how).

We already have at least these:

notmuch {insert,new,reindex,reply,show} --decrypt overrides index.decrypt
NOTMUCH_DATABASE overrides database.path

and the env vars for config and profile interacting with --config. So I
was hoping for a systematic solution. But I just love "git -c" too much :)

Michael

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 0/2] add --new-tags= to notmuch-new(1)
  2021-09-16 10:25 [RFC 0/2] add --new-tags= to notmuch-new(1) andreas
                   ` (2 preceding siblings ...)
  2021-09-16 12:03 ` [RFC 0/2] add --new-tags= " Michael J Gruber
@ 2021-09-20 23:57 ` David Bremner
  3 siblings, 0 replies; 7+ messages in thread
From: David Bremner @ 2021-09-20 23:57 UTC (permalink / raw)
  To: andreas, notmuch

andreas@rammhold.de writes:

> I am posting this as an RFC to see what other users and the developers think
> about this approach and if anyone has solved a similar issue (in a different
> way).

I honestly don't know if what you propose is the best solution for your
problem. Looking at the functionality for it's own sake, I wonder why
you don't copy the UI from notmuch insert, which already allows
specifying arbitrary tag operations for newly delivered messages. I
suspect some of the code can re-used as well.

d

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-09-20 23:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-16 10:25 [RFC 0/2] add --new-tags= to notmuch-new(1) andreas
2021-09-16 10:25 ` [RFC 1/2] lib/config: introduce notmuch_config_values_from_string function andreas
2021-09-16 10:25 ` [RFC 2/2] CLI/notmuch: add --new-tags argument to notmuch-new(1) andreas
2021-09-16 12:03 ` [RFC 0/2] add --new-tags= " Michael J Gruber
2021-09-16 12:24   ` andreas
2021-09-16 14:43     ` Michael J Gruber
2021-09-20 23:57 ` David Bremner

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).