unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH 0/5] cli: alternative address deduplication
@ 2015-08-29 14:56 Jani Nikula
  2015-08-29 14:56 ` [RFC PATCH 1/5] cli: g_hash_table_lookup_extended is overkill Jani Nikula
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Jani Nikula @ 2015-08-29 14:56 UTC (permalink / raw)
  To: notmuch

This series adds support both for not deduplicating addresses in notmuch
address, and for using just the case insensitive address part of an email
address. The new deduplication picks the most popular variant.

Real life examples, current deduplication:

$ notmuch address --output=count from:amdragon
1544	Austin Clements <amdragon@MIT.EDU>
63	Austin Clements <amdragon@mit.edu>

$ notmuch address --output=count from:jani@nikula.org
2344	Jani Nikula <jani@nikula.org>
3	Jani <jani@nikula.org>
5	jani@nikula.org
6	"jani@nikula.org" <jani@nikula.org>

And new deduplication:

$ notmuch address --output=count --deduplicate=address from:amdragon
1607	Austin Clements <amdragon@MIT.EDU>

$ notmuch address --output=count --deduplicate=address from:jani@nikula.org
2358	Jani Nikula <jani@nikula.org>

BR,
Jani.


Jani Nikula (5):
  cli: g_hash_table_lookup_extended is overkill
  cli: abstract new mailbox creation
  cli: add support for not deduplicating notmuch address results
  cli: change the data structure for notmuch address deduplication
  cli: add support for deduplicating based on case insensitive address

 notmuch-client.h |   1 +
 notmuch-search.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 130 insertions(+), 20 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 1/5] cli: g_hash_table_lookup_extended is overkill
  2015-08-29 14:56 [PATCH 0/5] cli: alternative address deduplication Jani Nikula
@ 2015-08-29 14:56 ` Jani Nikula
  2015-08-29 14:56 ` [RFC PATCH 2/5] cli: abstract new mailbox creation Jani Nikula
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Jani Nikula @ 2015-08-29 14:56 UTC (permalink / raw)
  To: notmuch

Switch to normal glib hash table lookup. The extended version is only
required if the values may contain NULL.
---
 notmuch-search.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index 3076c3f637b1..7fdc6acaa2fe 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -248,7 +248,6 @@ do_search_threads (search_context_t *ctx)
 static notmuch_bool_t
 is_duplicate (const search_context_t *ctx, const char *name, const char *addr)
 {
-    notmuch_bool_t duplicate;
     char *key;
     mailbox_t *mailbox;
 
@@ -256,20 +255,20 @@ is_duplicate (const search_context_t *ctx, const char *name, const char *addr)
     if (! key)
 	return FALSE;
 
-    duplicate = g_hash_table_lookup_extended (ctx->addresses, key, NULL, (gpointer)&mailbox);
-
-    if (! duplicate) {
-	mailbox = talloc (ctx->format, mailbox_t);
-	mailbox->name = talloc_strdup (mailbox, name);
-	mailbox->addr = talloc_strdup (mailbox, addr);
-	mailbox->count = 1;
-	g_hash_table_insert (ctx->addresses, key, mailbox);
-    } else {
+    mailbox = g_hash_table_lookup (ctx->addresses, key);
+    if (mailbox) {
 	mailbox->count++;
 	talloc_free (key);
+	return TRUE;
     }
 
-    return duplicate;
+    mailbox = talloc (ctx->format, mailbox_t);
+    mailbox->name = talloc_strdup (mailbox, name);
+    mailbox->addr = talloc_strdup (mailbox, addr);
+    mailbox->count = 1;
+    g_hash_table_insert (ctx->addresses, key, mailbox);
+
+    return FALSE;
 }
 
 static void
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 2/5] cli: abstract new mailbox creation
  2015-08-29 14:56 [PATCH 0/5] cli: alternative address deduplication Jani Nikula
  2015-08-29 14:56 ` [RFC PATCH 1/5] cli: g_hash_table_lookup_extended is overkill Jani Nikula
@ 2015-08-29 14:56 ` Jani Nikula
  2015-08-29 14:56 ` [RFC PATCH 3/5] cli: add support for not deduplicating notmuch address results Jani Nikula
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Jani Nikula @ 2015-08-29 14:56 UTC (permalink / raw)
  To: notmuch

We'll be needing more mailbox creation soon, so abstract it
away. While at it, check for allocation failures. No other functional
changes.
---
 notmuch-search.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index 7fdc6acaa2fe..36f58eb8d54c 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -243,6 +243,21 @@ do_search_threads (search_context_t *ctx)
     return 0;
 }
 
+static mailbox_t *new_mailbox (void *ctx, const char *name, const char *addr)
+{
+    mailbox_t *mailbox;
+
+    mailbox = talloc (ctx, mailbox_t);
+    if (! mailbox)
+	return NULL;
+
+    mailbox->name = talloc_strdup (mailbox, name);
+    mailbox->addr = talloc_strdup (mailbox, addr);
+    mailbox->count = 1;
+
+    return mailbox;
+}
+
 /* Returns TRUE iff name and addr is duplicate. If not, stores the
  * name/addr pair in order to detect subsequent duplicates. */
 static notmuch_bool_t
@@ -262,10 +277,10 @@ is_duplicate (const search_context_t *ctx, const char *name, const char *addr)
 	return TRUE;
     }
 
-    mailbox = talloc (ctx->format, mailbox_t);
-    mailbox->name = talloc_strdup (mailbox, name);
-    mailbox->addr = talloc_strdup (mailbox, addr);
-    mailbox->count = 1;
+    mailbox = new_mailbox (ctx->format, name, addr);
+    if (! mailbox)
+	return FALSE;
+
     g_hash_table_insert (ctx->addresses, key, mailbox);
 
     return FALSE;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 3/5] cli: add support for not deduplicating notmuch address results
  2015-08-29 14:56 [PATCH 0/5] cli: alternative address deduplication Jani Nikula
  2015-08-29 14:56 ` [RFC PATCH 1/5] cli: g_hash_table_lookup_extended is overkill Jani Nikula
  2015-08-29 14:56 ` [RFC PATCH 2/5] cli: abstract new mailbox creation Jani Nikula
@ 2015-08-29 14:56 ` Jani Nikula
  2015-08-30  1:29   ` David Bremner
  2015-08-29 14:56 ` [RFC PATCH 4/5] cli: change the data structure for notmuch address deduplication Jani Nikula
  2015-08-29 14:56 ` [RFC PATCH 5/5] cli: add support for deduplicating based on case insensitive address Jani Nikula
  4 siblings, 1 reply; 15+ messages in thread
From: Jani Nikula @ 2015-08-29 14:56 UTC (permalink / raw)
  To: notmuch

Make it possible to use notmuch address as part of a | sort | uniq -c
pipe instead of forcing --output=count. This is useful for combining
results from multiple notmuch address queries.
---
 notmuch-search.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index 36f58eb8d54c..be8afcc0187b 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -355,7 +355,8 @@ process_address_list (const search_context_t *ctx,
 		.count = 0,
 	    };
 
-	    if (is_duplicate (ctx, mbx.name, mbx.addr))
+	    if ((ctx->output & OUTPUT_COUNT || ctx->dupe) &&
+		is_duplicate (ctx, mbx.name, mbx.addr))
 		continue;
 
 	    if (ctx->output & OUTPUT_COUNT)
@@ -755,6 +756,10 @@ notmuch_address_command (notmuch_config_t *config, int argc, char *argv[])
 	  (notmuch_keyword_t []){ { "true", NOTMUCH_EXCLUDE_TRUE },
 				  { "false", NOTMUCH_EXCLUDE_FALSE },
 				  { 0, 0 } } },
+	{ NOTMUCH_OPT_KEYWORD, &ctx->dupe, "deduplicate", 'x',
+	  (notmuch_keyword_t []){ { "yes", -1 },
+				  { "no", 0 },
+				  { 0, 0 } } },
 	{ NOTMUCH_OPT_INHERIT, (void *) &common_options, NULL, 0, 0 },
 	{ NOTMUCH_OPT_INHERIT, (void *) &notmuch_shared_options, NULL, 0, 0 },
 	{ 0, 0, 0, 0, 0 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 4/5] cli: change the data structure for notmuch address deduplication
  2015-08-29 14:56 [PATCH 0/5] cli: alternative address deduplication Jani Nikula
                   ` (2 preceding siblings ...)
  2015-08-29 14:56 ` [RFC PATCH 3/5] cli: add support for not deduplicating notmuch address results Jani Nikula
@ 2015-08-29 14:56 ` Jani Nikula
  2015-08-30  1:48   ` David Bremner
  2015-08-29 14:56 ` [RFC PATCH 5/5] cli: add support for deduplicating based on case insensitive address Jani Nikula
  4 siblings, 1 reply; 15+ messages in thread
From: Jani Nikula @ 2015-08-29 14:56 UTC (permalink / raw)
  To: notmuch

Currently we key the address hash table with the case sensitive "name
<address>". Switch to case insensitive keying with just address, and
store the case sensitive name and address in linked lists. This will
be helpful in adding support different deduplication schemes in the
future. There will be a slight performance penalty for the current
full case sensitive name + address deduplication, but this is simpler
as a whole when other deduplication schemes are added, and I expect
the schemes to be added to become more popular than the current
default.
---
 notmuch-client.h |  1 +
 notmuch-search.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++----------
 2 files changed, 76 insertions(+), 16 deletions(-)

diff --git a/notmuch-client.h b/notmuch-client.h
index 882aa30563df..97d68d1158ac 100644
--- a/notmuch-client.h
+++ b/notmuch-client.h
@@ -48,6 +48,7 @@ typedef GMimeCryptoContext notmuch_crypto_context_t;
 #include <dirent.h>
 #include <errno.h>
 #include <signal.h>
+#include <ctype.h>
 
 #include "talloc-extra.h"
 
diff --git a/notmuch-search.c b/notmuch-search.c
index be8afcc0187b..60311393198d 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -258,30 +258,79 @@ static mailbox_t *new_mailbox (void *ctx, const char *name, const char *addr)
     return mailbox;
 }
 
+static int
+strcase_equal (const void *a, const void *b)
+{
+    return strcasecmp (a, b) == 0;
+}
+
+static unsigned int
+strcase_hash (const void *ptr)
+{
+    const char *s = ptr;
+
+    /* This is the djb2 hash. */
+    unsigned int hash = 5381;
+    while (s && *s) {
+	hash = ((hash << 5) + hash) + tolower (*s);
+	s++;
+    }
+
+    return hash;
+}
+
+static int mailbox_compare (const void *v1, const void *v2)
+{
+    const mailbox_t *m1 = v1, *m2 = v2;
+    int v;
+
+    if (m1->name && m2->name)
+	v = strcmp (m1->name, m2->name);
+    else
+	v = !!m1->name - !!m2->name;
+
+    if (! v)
+	v = strcmp (m1->addr, m2->addr);
+
+    return v;
+}
+
 /* Returns TRUE iff name and addr is duplicate. If not, stores the
  * name/addr pair in order to detect subsequent duplicates. */
 static notmuch_bool_t
 is_duplicate (const search_context_t *ctx, const char *name, const char *addr)
 {
     char *key;
+    GList *list, *l;
     mailbox_t *mailbox;
 
-    key = talloc_asprintf (ctx->format, "%s <%s>", name, addr);
-    if (! key)
+    mailbox = new_mailbox (ctx->format, name, addr);
+    if (! mailbox)
 	return FALSE;
 
-    mailbox = g_hash_table_lookup (ctx->addresses, key);
-    if (mailbox) {
-	mailbox->count++;
-	talloc_free (key);
-	return TRUE;
+    list = g_hash_table_lookup (ctx->addresses, addr);
+    if (list) {
+	l = g_list_find_custom (list, mailbox, mailbox_compare);
+	if (l) {
+	    talloc_free (mailbox);
+	    mailbox = l->data;
+	    mailbox->count++;
+	    return TRUE;
+	}
+
+	g_list_append (list, mailbox);
+	return FALSE;
     }
 
-    mailbox = new_mailbox (ctx->format, name, addr);
-    if (! mailbox)
+    key = talloc_strdup (ctx->format, addr);
+    if (! key)
 	return FALSE;
 
-    g_hash_table_insert (ctx->addresses, key, mailbox);
+    list = g_list_append (NULL, mailbox);
+    if (! list)
+	return FALSE;
+
+    g_hash_table_insert (ctx->addresses, key, list);
 
     return FALSE;
 }
@@ -393,12 +442,21 @@ _talloc_free_for_g_hash (void *ptr)
 }
 
 static void
-print_hash_value (unused (gpointer key), gpointer value, gpointer user_data)
+_list_free_for_g_hash (void *ptr)
+{
+    g_list_free_full (ptr, _talloc_free_for_g_hash);
+}
+
+static void
+print_list_value (void *mailbox, void *context)
 {
-    const mailbox_t *mailbox = value;
-    search_context_t *ctx = user_data;
+    print_mailbox (context, mailbox);
+}
 
-    print_mailbox (ctx, mailbox);
+static void
+print_hash_value (unused (void *key), void *list, void *context)
+{
+    g_list_foreach (list, print_list_value, context);
 }
 
 static int
@@ -778,8 +836,9 @@ notmuch_address_command (notmuch_config_t *config, int argc, char *argv[])
 				 argc - opt_index, argv + opt_index))
 	return EXIT_FAILURE;
 
-    ctx->addresses = g_hash_table_new_full (g_str_hash, g_str_equal,
-					    _talloc_free_for_g_hash, _talloc_free_for_g_hash);
+    ctx->addresses = g_hash_table_new_full (strcase_hash, strcase_equal,
+					    _talloc_free_for_g_hash,
+					    _list_free_for_g_hash);
 
     ret = do_search_messages (ctx);
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 5/5] cli: add support for deduplicating based on case insensitive address
  2015-08-29 14:56 [PATCH 0/5] cli: alternative address deduplication Jani Nikula
                   ` (3 preceding siblings ...)
  2015-08-29 14:56 ` [RFC PATCH 4/5] cli: change the data structure for notmuch address deduplication Jani Nikula
@ 2015-08-29 14:56 ` Jani Nikula
  2015-08-30 12:06   ` David Bremner
  4 siblings, 1 reply; 15+ messages in thread
From: Jani Nikula @ 2015-08-29 14:56 UTC (permalink / raw)
  To: notmuch

Consider all variants of an email address as one, and print the most
common variant.
---
 notmuch-search.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index 60311393198d..537298788ab9 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -447,6 +447,31 @@ _list_free_for_g_hash (void *ptr)
     g_list_free_full (ptr, _talloc_free_for_g_hash);
 }
 
+/* Print the most common variant of a list of unique mailboxes, and
+ * conflate the counts. */
+static void
+print_popular (const search_context_t *ctx, GList *list)
+{
+    GList *l;
+    mailbox_t *mailbox = NULL, *m;
+    int max = 0;
+    int total = 0;
+
+    for (l = list; l; l = l->next) {
+	m = l->data;
+	total += m->count;
+	if (m->count > max) {
+	    mailbox = m;
+	    max = m->count;
+	}
+    }
+
+    /* The original count is no longer needed, so overwrite. */
+    mailbox->count = total;
+
+    print_mailbox (ctx, mailbox);
+}
+
 static void
 print_list_value (void *mailbox, void *context)
 {
@@ -456,7 +481,12 @@ print_list_value (void *mailbox, void *context)
 static void
 print_hash_value (unused (void *key), void *list, void *context)
 {
-    g_list_foreach (list, print_list_value, context);
+    const search_context_t *ctx = context;
+
+    if (ctx->dupe == 1)
+	print_popular (ctx, list);
+    else
+	g_list_foreach (list, print_list_value, context);
 }
 
 static int
@@ -817,6 +847,7 @@ notmuch_address_command (notmuch_config_t *config, int argc, char *argv[])
 	{ NOTMUCH_OPT_KEYWORD, &ctx->dupe, "deduplicate", 'x',
 	  (notmuch_keyword_t []){ { "yes", -1 },
 				  { "no", 0 },
+				  { "address", 1 },
 				  { 0, 0 } } },
 	{ NOTMUCH_OPT_INHERIT, (void *) &common_options, NULL, 0, 0 },
 	{ NOTMUCH_OPT_INHERIT, (void *) &notmuch_shared_options, NULL, 0, 0 },
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 3/5] cli: add support for not deduplicating notmuch address results
  2015-08-29 14:56 ` [RFC PATCH 3/5] cli: add support for not deduplicating notmuch address results Jani Nikula
@ 2015-08-30  1:29   ` David Bremner
  2015-08-30  7:33     ` Jani Nikula
  0 siblings, 1 reply; 15+ messages in thread
From: David Bremner @ 2015-08-30  1:29 UTC (permalink / raw)
  To: Jani Nikula, notmuch

Jani Nikula <jani@nikula.org> writes:


> +	{ NOTMUCH_OPT_KEYWORD, &ctx->dupe, "deduplicate", 'x',

probably you want 'D' or 'd' here. Not that it makes a practical
difference at this point.

> +	  (notmuch_keyword_t []){ { "yes", -1 },

I'm not very enthusiastic about reusing ctx->dupe for this.
In particular the use of -1 for yes is off-putting
It seems better to allocate another word of memory and use an
enum, as elsewhere.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 4/5] cli: change the data structure for notmuch address deduplication
  2015-08-29 14:56 ` [RFC PATCH 4/5] cli: change the data structure for notmuch address deduplication Jani Nikula
@ 2015-08-30  1:48   ` David Bremner
  2015-08-30  7:45     ` Jani Nikula
  2015-08-30  7:47     ` Jani Nikula
  0 siblings, 2 replies; 15+ messages in thread
From: David Bremner @ 2015-08-30  1:48 UTC (permalink / raw)
  To: Jani Nikula, notmuch

Jani Nikula <jani@nikula.org> writes:

>  
> +static int
> +strcase_equal (const void *a, const void *b)
> +{
> +    return strcasecmp (a, b) == 0;
> +}
> +
> +static unsigned int
> +strcase_hash (const void *ptr)
> +{
> +    const char *s = ptr;
> +
> +    /* This is the djb2 hash. */
> +    unsigned int hash = 5381;
> +    while (s && *s) {
> +	hash = ((hash << 5) + hash) + tolower (*s);
> +	s++;
> +    }
> +
> +    return hash;
> +}
> +

as discussed, these functions probably need to be factored out into
libutil.

> +	l = g_list_find_custom (list, mailbox, mailbox_compare);
> +	if (l) {
> +	    talloc_free (mailbox);
> +	    mailbox = l->data;
> +	    mailbox->count++;
> +	    return TRUE;
> +	}

I found this use of mailbox as a temporary variable confusing; despite
the obvious return I thought it might have something to do with the
g_list_append below.  Maybe just make a block scope temporary variable?

> +
> +	g_list_append (list, mailbox);
> +	return FALSE;
>      }
>  
> -    mailbox = new_mailbox (ctx->format, name, addr);
> -    if (! mailbox)
> +    key = talloc_strdup (ctx->format, addr);
> +    if (! key)
>  	return FALSE;

I guess this doesn't make the error handling worse; both old and new
code silently ignore OOM if I understand correctly. Do you happen to
understand the original choice of using ctx->format rather than that
ctx->notmuch for a talloc parent? it doesn't seem to get deallocated any
earlier.

d

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 3/5] cli: add support for not deduplicating notmuch address results
  2015-08-30  1:29   ` David Bremner
@ 2015-08-30  7:33     ` Jani Nikula
  0 siblings, 0 replies; 15+ messages in thread
From: Jani Nikula @ 2015-08-30  7:33 UTC (permalink / raw)
  To: David Bremner, notmuch

On Sun, 30 Aug 2015, David Bremner <david@tethera.net> wrote:
> Jani Nikula <jani@nikula.org> writes:
>
>
>> +	{ NOTMUCH_OPT_KEYWORD, &ctx->dupe, "deduplicate", 'x',
>
> probably you want 'D' or 'd' here. Not that it makes a practical
> difference at this point.
>
>> +	  (notmuch_keyword_t []){ { "yes", -1 },
>
> I'm not very enthusiastic about reusing ctx->dupe for this.
> In particular the use of -1 for yes is off-putting
> It seems better to allocate another word of memory and use an
> enum, as elsewhere.

Agreed on both.

BR,
Jani.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 4/5] cli: change the data structure for notmuch address deduplication
  2015-08-30  1:48   ` David Bremner
@ 2015-08-30  7:45     ` Jani Nikula
  2015-08-30 11:46       ` David Bremner
  2015-08-30  7:47     ` Jani Nikula
  1 sibling, 1 reply; 15+ messages in thread
From: Jani Nikula @ 2015-08-30  7:45 UTC (permalink / raw)
  To: David Bremner, notmuch

On Sun, 30 Aug 2015, David Bremner <david@tethera.net> wrote:
> Jani Nikula <jani@nikula.org> writes:
>
>>  
>> +static int
>> +strcase_equal (const void *a, const void *b)
>> +{
>> +    return strcasecmp (a, b) == 0;
>> +}
>> +
>> +static unsigned int
>> +strcase_hash (const void *ptr)
>> +{
>> +    const char *s = ptr;
>> +
>> +    /* This is the djb2 hash. */
>> +    unsigned int hash = 5381;
>> +    while (s && *s) {
>> +	hash = ((hash << 5) + hash) + tolower (*s);
>> +	s++;
>> +    }
>> +
>> +    return hash;
>> +}
>> +
>
> as discussed, these functions probably need to be factored out into
> libutil.

Ack.

>
>> +	l = g_list_find_custom (list, mailbox, mailbox_compare);
>> +	if (l) {
>> +	    talloc_free (mailbox);
>> +	    mailbox = l->data;
>> +	    mailbox->count++;
>> +	    return TRUE;
>> +	}
>
> I found this use of mailbox as a temporary variable confusing; despite
> the obvious return I thought it might have something to do with the
> g_list_append below.  Maybe just make a block scope temporary variable?

This is how the function would turn out with that. Better, I guess? I
also tried to think of ways to combine the two g_list_append paths here,
but in the end doing it like this has most clarity I think.

static notmuch_bool_t
is_duplicate (const search_context_t *ctx, const char *name, const char *addr)
{
    char *key;
    GList *list, *l;
    mailbox_t *mailbox;

    list = g_hash_table_lookup (ctx->addresses, addr);
    if (list) {
	mailbox_t find = {
	    .name = name,
	    .addr = addr,
	};

	l = g_list_find_custom (list, &find, mailbox_compare);
	if (l) {
	    mailbox = l->data;
	    mailbox->count++;
	    return TRUE;
	}

	mailbox = new_mailbox (ctx->format, name, addr);
	if (! mailbox)
	    return FALSE;

	g_list_append (list, mailbox);
	return FALSE;
    }

    key = talloc_strdup (ctx->format, addr);
    if (! key)
	return FALSE;

    mailbox = new_mailbox (ctx->format, name, addr);
    if (! mailbox)
	return FALSE;

    list = g_list_append (NULL, mailbox);
    if (! list)
	return FALSE;

    g_hash_table_insert (ctx->addresses, key, list);

    return FALSE;
}

>> +
>> +	g_list_append (list, mailbox);
>> +	return FALSE;
>>      }
>>  
>> -    mailbox = new_mailbox (ctx->format, name, addr);
>> -    if (! mailbox)
>> +    key = talloc_strdup (ctx->format, addr);
>> +    if (! key)
>>  	return FALSE;
>
> I guess this doesn't make the error handling worse; both old and new
> code silently ignore OOM if I understand correctly. Do you happen to
> understand the original choice of using ctx->format rather than that
> ctx->notmuch for a talloc parent? it doesn't seem to get deallocated any
> earlier.

I don't know or understand that part of the history. It doesn't really
matter though because the deallocation is explicitly done on all keys
and values via g_hash_table_unref.

BR,
Jani.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 4/5] cli: change the data structure for notmuch address deduplication
  2015-08-30  1:48   ` David Bremner
  2015-08-30  7:45     ` Jani Nikula
@ 2015-08-30  7:47     ` Jani Nikula
  1 sibling, 0 replies; 15+ messages in thread
From: Jani Nikula @ 2015-08-30  7:47 UTC (permalink / raw)
  To: David Bremner, notmuch

On Sun, 30 Aug 2015, David Bremner <david@tethera.net> wrote:
> I guess this doesn't make the error handling worse; both old and new
> code silently ignore OOM if I understand correctly.

Oh, and current git will not silently ignore OOM. It will segfault... ;)

BR,
Jani.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 4/5] cli: change the data structure for notmuch address deduplication
  2015-08-30  7:45     ` Jani Nikula
@ 2015-08-30 11:46       ` David Bremner
  0 siblings, 0 replies; 15+ messages in thread
From: David Bremner @ 2015-08-30 11:46 UTC (permalink / raw)
  To: Jani Nikula, notmuch

Jani Nikula <jani@nikula.org> writes:

>> I found this use of mailbox as a temporary variable confusing; despite
>> the obvious return I thought it might have something to do with the
>> g_list_append below.  Maybe just make a block scope temporary variable?
>
> This is how the function would turn out with that. Better, I guess? I
> also tried to think of ways to combine the two g_list_append paths here,
> but in the end doing it like this has most clarity I think.
>

Your (new) version is fine for me. As it happens I was thinking of a
smaller tweak:


	l = g_list_find_custom (list, mailbox, mailbox_compare);
	if (l) {
            mailbox_t *found;
            
	    talloc_free (mailbox);

            found = l->data;
            found->count++;
            
	    return TRUE;
	}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 5/5] cli: add support for deduplicating based on case insensitive address
  2015-08-29 14:56 ` [RFC PATCH 5/5] cli: add support for deduplicating based on case insensitive address Jani Nikula
@ 2015-08-30 12:06   ` David Bremner
  2015-08-31 10:52     ` Tomi Ollila
  0 siblings, 1 reply; 15+ messages in thread
From: David Bremner @ 2015-08-30 12:06 UTC (permalink / raw)
  To: Jani Nikula, notmuch

Jani Nikula <jani@nikula.org> writes:

> Consider all variants of an email address as one, and print the most
> common variant.

Other than the quibbles already mentioned, the series looks ok to
me. For production it should have one or two tests I guess. Oh, and man
page updates. But you knew that I guess.

I can live with the current argument syntax, but since a certain a mount
of bikeshedding is expected, here is an alternative suggestion

--deduplication={none|mailbox|address}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 5/5] cli: add support for deduplicating based on case insensitive address
  2015-08-30 12:06   ` David Bremner
@ 2015-08-31 10:52     ` Tomi Ollila
  2015-08-31 11:15       ` David Bremner
  0 siblings, 1 reply; 15+ messages in thread
From: Tomi Ollila @ 2015-08-31 10:52 UTC (permalink / raw)
  To: David Bremner, Jani Nikula, notmuch

On Sun, Aug 30 2015, David Bremner <david@tethera.net> wrote:

> Jani Nikula <jani@nikula.org> writes:
>
>> Consider all variants of an email address as one, and print the most
>> common variant.
>
> Other than the quibbles already mentioned, the series looks ok to
> me. For production it should have one or two tests I guess. Oh, and man
> page updates. But you knew that I guess.
>
> I can live with the current argument syntax, but since a certain a mount
> of bikeshedding is expected, here is an alternative suggestion
>
> --deduplication={none|mailbox|address}

(is s/deduplicate/deduplication/ intended or accidental change?)

Is this complete alternative to --deduplicate={no|yes|address}, respectively?

If it is it looks somewhat better to me (provided that it is accurate)
anyway, the user command line interface looks good to me.



Tomi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 5/5] cli: add support for deduplicating based on case insensitive address
  2015-08-31 10:52     ` Tomi Ollila
@ 2015-08-31 11:15       ` David Bremner
  0 siblings, 0 replies; 15+ messages in thread
From: David Bremner @ 2015-08-31 11:15 UTC (permalink / raw)
  To: Tomi Ollila, Jani Nikula, notmuch

Tomi Ollila <tomi.ollila@iki.fi> writes:

>> I can live with the current argument syntax, but since a certain a mount
>> of bikeshedding is expected, here is an alternative suggestion
>>
>> --deduplication={none|mailbox|address}
>
> (is s/deduplicate/deduplication/ intended or accidental change?)

accidental. hmm. well,

--deduplicate={no|mailbox|address}

is a few characters shorter.

> Is this complete alternative to --deduplicate={no|yes|address},
> respectively?

Yes

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-08-31 11:18 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-29 14:56 [PATCH 0/5] cli: alternative address deduplication Jani Nikula
2015-08-29 14:56 ` [RFC PATCH 1/5] cli: g_hash_table_lookup_extended is overkill Jani Nikula
2015-08-29 14:56 ` [RFC PATCH 2/5] cli: abstract new mailbox creation Jani Nikula
2015-08-29 14:56 ` [RFC PATCH 3/5] cli: add support for not deduplicating notmuch address results Jani Nikula
2015-08-30  1:29   ` David Bremner
2015-08-30  7:33     ` Jani Nikula
2015-08-29 14:56 ` [RFC PATCH 4/5] cli: change the data structure for notmuch address deduplication Jani Nikula
2015-08-30  1:48   ` David Bremner
2015-08-30  7:45     ` Jani Nikula
2015-08-30 11:46       ` David Bremner
2015-08-30  7:47     ` Jani Nikula
2015-08-29 14:56 ` [RFC PATCH 5/5] cli: add support for deduplicating based on case insensitive address Jani Nikula
2015-08-30 12:06   ` David Bremner
2015-08-31 10:52     ` Tomi Ollila
2015-08-31 11:15       ` David Bremner

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).