unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
* [PATCH v5 0/7] notmuch search --output=sender/recipients
@ 2014-10-30 23:59 Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 1/7] cli: search: Refactor passing of command line options Michal Sojka
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-30 23:59 UTC (permalink / raw)
  To: notmuch

Hi all,

this is v5 of the search --output=address series. It obsoletes v4
(id:1414421455-3037-1-git-send-email-sojkam1@fel.cvut.cz).

I addresses comments from Mark and Tomi. Based on the comments to v4
and earlier versions, patches 1-4 should be ready for merging. Patch 5
is a non-controversial part of the controversial --filter-by patch and
could be probably merged after review.

Patch 6 needs at least a review and patch 7 needs more discussion.

Changes from v4:

- patch changed to commit in commit messages
- opt->format changed to format
- Added comments to process_* functions
- duplicite changed to duplicate
- check_duplicate changed to is_duplicate
- Deduplication was split into two commits: basic deduplication
  without a command line option and configurable deduplication with
  --fiter-by.

Changes from v3:

- `o' renamed to `opt'.
- Conversion of --output from keyword to keyword-flags is now a
  separate patch.
- Structured output formats print name and address separately.
- Added test for --format=json.
- Changed --filter-by default to nameaddr. In v2, the default was
  addrfold, in v3 the default was no filtering at all. I believe that
  Mark's suggestion to make nameaddr the default is good trade off.
- Added new --output=count
- Minor style fixes
- Few typos fixed
- There is no way to output unfiltered (duplicite) addresses.
  Hopefully, the introduction of --output=count is sufficient
  replacement for this "feature".

Cheers,
-Michal


Jani Nikula (1):
  cli: Add support for parsing keyword-flag arguments

Michal Sojka (6):
  cli: search: Refactor passing of command line options
  cli: search: Convert --output to keyword-flag argument
  cli: search: Add --output={sender,recipients}
  cli: search: Do not output duplicate addresses
  cli: search: Add --output=count
  cli: search: Add --filter-by option to configure address filtering

 command-line-arguments.c           |   6 +-
 command-line-arguments.h           |   1 +
 completion/notmuch-completion.bash |   8 +-
 completion/notmuch-completion.zsh  |   4 +-
 doc/man1/notmuch-search.rst        |  66 ++++++-
 notmuch-search.c                   | 388 +++++++++++++++++++++++++++++--------
 test/T090-search-output.sh         | 137 +++++++++++++
 test/T095-search-filter-by.sh      |  64 ++++++
 test/T410-argument-parsing.sh      |   3 +-
 test/arg-test.c                    |   9 +
 10 files changed, 604 insertions(+), 82 deletions(-)
 create mode 100755 test/T095-search-filter-by.sh

-- 
2.1.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 1/7] cli: search: Refactor passing of command line options
  2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
@ 2014-10-30 23:59 ` Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 2/7] cli: Add support for parsing keyword-flag arguments Michal Sojka
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-30 23:59 UTC (permalink / raw)
  To: notmuch

Many functions that implement the search command need to access command
line options. Instead of passing each option in a separate variable, put
them in a structure and pass only this structure.

This will become handy in the following commits.
---
 notmuch-search.c | 125 ++++++++++++++++++++++++++++---------------------------
 1 file changed, 64 insertions(+), 61 deletions(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index bc9be45..0c3e972 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -30,6 +30,16 @@ typedef enum {
     OUTPUT_TAGS
 } output_t;
 
+typedef struct {
+    sprinter_t *format;
+    notmuch_query_t *query;
+    notmuch_sort_t sort;
+    output_t output;
+    int offset;
+    int limit;
+    int dupe;
+} search_options_t;
+
 /* Return two stable query strings that identify exactly the matched
  * and unmatched messages currently in thread.  If there are no
  * matched or unmatched messages, the returned buffers will be
@@ -70,43 +80,39 @@ get_thread_query (notmuch_thread_t *thread,
 }
 
 static int
-do_search_threads (sprinter_t *format,
-		   notmuch_query_t *query,
-		   notmuch_sort_t sort,
-		   output_t output,
-		   int offset,
-		   int limit)
+do_search_threads (search_options_t *opt)
 {
     notmuch_thread_t *thread;
     notmuch_threads_t *threads;
     notmuch_tags_t *tags;
+    sprinter_t *format = opt->format;
     time_t date;
     int i;
 
-    if (offset < 0) {
-	offset += notmuch_query_count_threads (query);
-	if (offset < 0)
-	    offset = 0;
+    if (opt->offset < 0) {
+	opt->offset += notmuch_query_count_threads (opt->query);
+	if (opt->offset < 0)
+	    opt->offset = 0;
     }
 
-    threads = notmuch_query_search_threads (query);
+    threads = notmuch_query_search_threads (opt->query);
     if (threads == NULL)
 	return 1;
 
     format->begin_list (format);
 
     for (i = 0;
-	 notmuch_threads_valid (threads) && (limit < 0 || i < offset + limit);
+	 notmuch_threads_valid (threads) && (opt->limit < 0 || i < opt->offset + opt->limit);
 	 notmuch_threads_move_to_next (threads), i++)
     {
 	thread = notmuch_threads_get (threads);
 
-	if (i < offset) {
+	if (i < opt->offset) {
 	    notmuch_thread_destroy (thread);
 	    continue;
 	}
 
-	if (output == OUTPUT_THREADS) {
+	if (opt->output == OUTPUT_THREADS) {
 	    format->set_prefix (format, "thread");
 	    format->string (format,
 			    notmuch_thread_get_thread_id (thread));
@@ -123,7 +129,7 @@ do_search_threads (sprinter_t *format,
 
 	    format->begin_map (format);
 
-	    if (sort == NOTMUCH_SORT_OLDEST_FIRST)
+	    if (opt->sort == NOTMUCH_SORT_OLDEST_FIRST)
 		date = notmuch_thread_get_oldest_date (thread);
 	    else
 		date = notmuch_thread_get_newest_date (thread);
@@ -215,40 +221,36 @@ do_search_threads (sprinter_t *format,
 }
 
 static int
-do_search_messages (sprinter_t *format,
-		    notmuch_query_t *query,
-		    output_t output,
-		    int offset,
-		    int limit,
-		    int dupe)
+do_search_messages (search_options_t *opt)
 {
     notmuch_message_t *message;
     notmuch_messages_t *messages;
     notmuch_filenames_t *filenames;
+    sprinter_t *format = opt->format;
     int i;
 
-    if (offset < 0) {
-	offset += notmuch_query_count_messages (query);
-	if (offset < 0)
-	    offset = 0;
+    if (opt->offset < 0) {
+	opt->offset += notmuch_query_count_messages (opt->query);
+	if (opt->offset < 0)
+	    opt->offset = 0;
     }
 
-    messages = notmuch_query_search_messages (query);
+    messages = notmuch_query_search_messages (opt->query);
     if (messages == NULL)
 	return 1;
 
     format->begin_list (format);
 
     for (i = 0;
-	 notmuch_messages_valid (messages) && (limit < 0 || i < offset + limit);
+	 notmuch_messages_valid (messages) && (opt->limit < 0 || i < opt->offset + opt->limit);
 	 notmuch_messages_move_to_next (messages), i++)
     {
-	if (i < offset)
+	if (i < opt->offset)
 	    continue;
 
 	message = notmuch_messages_get (messages);
 
-	if (output == OUTPUT_FILES) {
+	if (opt->output == OUTPUT_FILES) {
 	    int j;
 	    filenames = notmuch_message_get_filenames (message);
 
@@ -256,7 +258,7 @@ do_search_messages (sprinter_t *format,
 		 notmuch_filenames_valid (filenames);
 		 notmuch_filenames_move_to_next (filenames), j++)
 	    {
-		if (dupe < 0 || dupe == j) {
+		if (opt->dupe < 0 || opt->dupe == j) {
 		    format->string (format, notmuch_filenames_get (filenames));
 		    format->separator (format);
 		}
@@ -283,12 +285,13 @@ do_search_messages (sprinter_t *format,
 
 static int
 do_search_tags (notmuch_database_t *notmuch,
-		sprinter_t *format,
-		notmuch_query_t *query)
+		const search_options_t *opt)
 {
     notmuch_messages_t *messages = NULL;
     notmuch_tags_t *tags;
     const char *tag;
+    sprinter_t *format = opt->format;
+    notmuch_query_t *query = opt->query;
 
     /* should the following only special case if no excluded terms
      * specified? */
@@ -333,16 +336,16 @@ int
 notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 {
     notmuch_database_t *notmuch;
-    notmuch_query_t *query;
+    search_options_t opt = {
+	.sort = NOTMUCH_SORT_NEWEST_FIRST,
+	.output = OUTPUT_SUMMARY,
+	.offset = 0,
+	.limit = -1, /* unlimited */
+	.dupe = -1,
+    };
     char *query_str;
-    notmuch_sort_t sort = NOTMUCH_SORT_NEWEST_FIRST;
-    sprinter_t *format = NULL;
     int opt_index, ret;
-    output_t output = OUTPUT_SUMMARY;
-    int offset = 0;
-    int limit = -1; /* unlimited */
     notmuch_exclude_t exclude = NOTMUCH_EXCLUDE_TRUE;
-    int dupe = -1;
     unsigned int i;
 
     enum {
@@ -353,7 +356,7 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
     } format_sel = NOTMUCH_FORMAT_TEXT;
 
     notmuch_opt_desc_t options[] = {
-	{ NOTMUCH_OPT_KEYWORD, &sort, "sort", 's',
+	{ NOTMUCH_OPT_KEYWORD, &opt.sort, "sort", 's',
 	  (notmuch_keyword_t []){ { "oldest-first", NOTMUCH_SORT_OLDEST_FIRST },
 				  { "newest-first", NOTMUCH_SORT_NEWEST_FIRST },
 				  { 0, 0 } } },
@@ -364,7 +367,7 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 				  { "text0", NOTMUCH_FORMAT_TEXT0 },
 				  { 0, 0 } } },
 	{ NOTMUCH_OPT_INT, &notmuch_format_version, "format-version", 0, 0 },
-	{ NOTMUCH_OPT_KEYWORD, &output, "output", 'o',
+	{ NOTMUCH_OPT_KEYWORD, &opt.output, "output", 'o',
 	  (notmuch_keyword_t []){ { "summary", OUTPUT_SUMMARY },
 				  { "threads", OUTPUT_THREADS },
 				  { "messages", OUTPUT_MESSAGES },
@@ -377,9 +380,9 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
                                   { "flag", NOTMUCH_EXCLUDE_FLAG },
                                   { "all", NOTMUCH_EXCLUDE_ALL },
                                   { 0, 0 } } },
-	{ NOTMUCH_OPT_INT, &offset, "offset", 'O', 0 },
-	{ NOTMUCH_OPT_INT, &limit, "limit", 'L', 0  },
-	{ NOTMUCH_OPT_INT, &dupe, "duplicate", 'D', 0  },
+	{ NOTMUCH_OPT_INT, &opt.offset, "offset", 'O', 0 },
+	{ NOTMUCH_OPT_INT, &opt.limit, "limit", 'L', 0  },
+	{ NOTMUCH_OPT_INT, &opt.dupe, "duplicate", 'D', 0  },
 	{ 0, 0, 0, 0, 0 }
     };
 
@@ -389,20 +392,20 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 
     switch (format_sel) {
     case NOTMUCH_FORMAT_TEXT:
-	format = sprinter_text_create (config, stdout);
+	opt.format = sprinter_text_create (config, stdout);
 	break;
     case NOTMUCH_FORMAT_TEXT0:
-	if (output == OUTPUT_SUMMARY) {
+	if (opt.output == OUTPUT_SUMMARY) {
 	    fprintf (stderr, "Error: --format=text0 is not compatible with --output=summary.\n");
 	    return EXIT_FAILURE;
 	}
-	format = sprinter_text0_create (config, stdout);
+	opt.format = sprinter_text0_create (config, stdout);
 	break;
     case NOTMUCH_FORMAT_JSON:
-	format = sprinter_json_create (config, stdout);
+	opt.format = sprinter_json_create (config, stdout);
 	break;
     case NOTMUCH_FORMAT_SEXP:
-	format = sprinter_sexp_create (config, stdout);
+	opt.format = sprinter_sexp_create (config, stdout);
 	break;
     default:
 	/* this should never happen */
@@ -425,15 +428,15 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 	return EXIT_FAILURE;
     }
 
-    query = notmuch_query_create (notmuch, query_str);
-    if (query == NULL) {
+    opt.query = notmuch_query_create (notmuch, query_str);
+    if (opt.query == NULL) {
 	fprintf (stderr, "Out of memory\n");
 	return EXIT_FAILURE;
     }
 
-    notmuch_query_set_sort (query, sort);
+    notmuch_query_set_sort (opt.query, opt.sort);
 
-    if (exclude == NOTMUCH_EXCLUDE_FLAG && output != OUTPUT_SUMMARY) {
+    if (exclude == NOTMUCH_EXCLUDE_FLAG && opt.output != OUTPUT_SUMMARY) {
 	/* If we are not doing summary output there is nowhere to
 	 * print the excluded flag so fall back on including the
 	 * excluded messages. */
@@ -448,29 +451,29 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 	search_exclude_tags = notmuch_config_get_search_exclude_tags
 	    (config, &search_exclude_tags_length);
 	for (i = 0; i < search_exclude_tags_length; i++)
-	    notmuch_query_add_tag_exclude (query, search_exclude_tags[i]);
-	notmuch_query_set_omit_excluded (query, exclude);
+	    notmuch_query_add_tag_exclude (opt.query, search_exclude_tags[i]);
+	notmuch_query_set_omit_excluded (opt.query, exclude);
     }
 
-    switch (output) {
+    switch (opt.output) {
     default:
     case OUTPUT_SUMMARY:
     case OUTPUT_THREADS:
-	ret = do_search_threads (format, query, sort, output, offset, limit);
+	ret = do_search_threads (&opt);
 	break;
     case OUTPUT_MESSAGES:
     case OUTPUT_FILES:
-	ret = do_search_messages (format, query, output, offset, limit, dupe);
+	ret = do_search_messages (&opt);
 	break;
     case OUTPUT_TAGS:
-	ret = do_search_tags (notmuch, format, query);
+	ret = do_search_tags (notmuch, &opt);
 	break;
     }
 
-    notmuch_query_destroy (query);
+    notmuch_query_destroy (opt.query);
     notmuch_database_destroy (notmuch);
 
-    talloc_free (format);
+    talloc_free (opt.format);
 
     return ret ? EXIT_FAILURE : EXIT_SUCCESS;
 }
-- 
2.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 2/7] cli: Add support for parsing keyword-flag arguments
  2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 1/7] cli: search: Refactor passing of command line options Michal Sojka
@ 2014-10-30 23:59 ` Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 3/7] cli: search: Convert --output to keyword-flag argument Michal Sojka
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-30 23:59 UTC (permalink / raw)
  To: notmuch

From: Jani Nikula <jani@nikula.org>

This allows having multiple --foo=bar --foo=baz options on the command
line, with the corresponding values OR'd together.

[Test added by Michal Sojka]
---
 command-line-arguments.c      | 6 +++++-
 command-line-arguments.h      | 1 +
 test/T410-argument-parsing.sh | 3 ++-
 test/arg-test.c               | 9 +++++++++
 4 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/command-line-arguments.c b/command-line-arguments.c
index 844d6c3..c6f7269 100644
--- a/command-line-arguments.c
+++ b/command-line-arguments.c
@@ -23,7 +23,10 @@ _process_keyword_arg (const notmuch_opt_desc_t *arg_desc, char next, const char
     while (keywords->name) {
 	if (strcmp (arg_str, keywords->name) == 0) {
 	    if (arg_desc->output_var) {
-		*((int *)arg_desc->output_var) = keywords->value;
+		if (arg_desc->opt_type == NOTMUCH_OPT_KEYWORD_FLAGS)
+		    *((int *)arg_desc->output_var) |= keywords->value;
+		else
+		    *((int *)arg_desc->output_var) = keywords->value;
 	    }
 	    return TRUE;
 	}
@@ -152,6 +155,7 @@ parse_option (const char *arg,
 
 	switch (try->opt_type) {
 	case NOTMUCH_OPT_KEYWORD:
+	case NOTMUCH_OPT_KEYWORD_FLAGS:
 	    return _process_keyword_arg (try, next, value);
 	case NOTMUCH_OPT_BOOLEAN:
 	    return _process_boolean_arg (try, next, value);
diff --git a/command-line-arguments.h b/command-line-arguments.h
index de1734a..6444129 100644
--- a/command-line-arguments.h
+++ b/command-line-arguments.h
@@ -8,6 +8,7 @@ enum notmuch_opt_type {
     NOTMUCH_OPT_BOOLEAN,	/* --verbose              */
     NOTMUCH_OPT_INT,		/* --frob=8               */
     NOTMUCH_OPT_KEYWORD,	/* --format=raw|json|text */
+    NOTMUCH_OPT_KEYWORD_FLAGS,	/* the above with values OR'd together */
     NOTMUCH_OPT_STRING,		/* --file=/tmp/gnarf.txt  */
     NOTMUCH_OPT_POSITION	/* notmuch dump pos_arg   */
 };
diff --git a/test/T410-argument-parsing.sh b/test/T410-argument-parsing.sh
index 94e9087..2e5d7ae 100755
--- a/test/T410-argument-parsing.sh
+++ b/test/T410-argument-parsing.sh
@@ -3,9 +3,10 @@ test_description="argument parsing"
 . ./test-lib.sh
 
 test_begin_subtest "sanity check"
-$TEST_DIRECTORY/arg-test  pos1  --keyword=one --string=foo pos2 --int=7 > OUTPUT
+$TEST_DIRECTORY/arg-test  pos1  --keyword=one --string=foo pos2 --int=7 --flag=one --flag=three > OUTPUT
 cat <<EOF > EXPECTED
 keyword 1
+flags 5
 int 7
 string foo
 positional arg 1 pos1
diff --git a/test/arg-test.c b/test/arg-test.c
index 6c49eac..736686d 100644
--- a/test/arg-test.c
+++ b/test/arg-test.c
@@ -7,6 +7,7 @@ int main(int argc, char **argv){
     int opt_index=1;
 
     int kw_val=0;
+    int fl_val=0;
     int int_val=0;
     char *pos_arg1=NULL;
     char *pos_arg2=NULL;
@@ -17,6 +18,11 @@ int main(int argc, char **argv){
 	  (notmuch_keyword_t []){ { "one", 1 },
 				  { "two", 2 },
 				  { 0, 0 } } },
+	{ NOTMUCH_OPT_KEYWORD_FLAGS, &fl_val, "flag", 'f',
+	  (notmuch_keyword_t []){ { "one",   1 << 0},
+				  { "two",   1 << 1 },
+				  { "three", 1 << 2 },
+				  { 0, 0 } } },
 	{ NOTMUCH_OPT_INT, &int_val, "int", 'i', 0},
 	{ NOTMUCH_OPT_STRING, &string_val, "string", 's', 0},
 	{ NOTMUCH_OPT_POSITION, &pos_arg1, 0,0, 0},
@@ -31,6 +37,9 @@ int main(int argc, char **argv){
     if (kw_val)
 	printf("keyword %d\n", kw_val);
 
+    if (fl_val)
+	printf("flags %d\n", fl_val);
+
     if (int_val)
 	printf("int %d\n", int_val);
 
-- 
2.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 3/7] cli: search: Convert --output to keyword-flag argument
  2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 1/7] cli: search: Refactor passing of command line options Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 2/7] cli: Add support for parsing keyword-flag arguments Michal Sojka
@ 2014-10-30 23:59 ` Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 4/7] cli: search: Add --output={sender,recipients} Michal Sojka
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-30 23:59 UTC (permalink / raw)
  To: notmuch

This converts "notmuch search" to use the recently introduced
keyword-flag argument parser. At this point, it only makes the code
slightly less readable but following commits that add new --output
keywords will profit from this.
---
 notmuch-search.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index 0c3e972..ce46877 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -23,11 +23,11 @@
 #include "string-util.h"
 
 typedef enum {
-    OUTPUT_SUMMARY,
-    OUTPUT_THREADS,
-    OUTPUT_MESSAGES,
-    OUTPUT_FILES,
-    OUTPUT_TAGS
+    OUTPUT_SUMMARY	= 1 << 0,
+    OUTPUT_THREADS	= 1 << 1,
+    OUTPUT_MESSAGES	= 1 << 2,
+    OUTPUT_FILES	= 1 << 3,
+    OUTPUT_TAGS		= 1 << 4,
 } output_t;
 
 typedef struct {
@@ -338,7 +338,7 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
     notmuch_database_t *notmuch;
     search_options_t opt = {
 	.sort = NOTMUCH_SORT_NEWEST_FIRST,
-	.output = OUTPUT_SUMMARY,
+	.output = 0,
 	.offset = 0,
 	.limit = -1, /* unlimited */
 	.dupe = -1,
@@ -367,7 +367,7 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 				  { "text0", NOTMUCH_FORMAT_TEXT0 },
 				  { 0, 0 } } },
 	{ NOTMUCH_OPT_INT, &notmuch_format_version, "format-version", 0, 0 },
-	{ NOTMUCH_OPT_KEYWORD, &opt.output, "output", 'o',
+	{ NOTMUCH_OPT_KEYWORD_FLAGS, &opt.output, "output", 'o',
 	  (notmuch_keyword_t []){ { "summary", OUTPUT_SUMMARY },
 				  { "threads", OUTPUT_THREADS },
 				  { "messages", OUTPUT_MESSAGES },
@@ -390,6 +390,9 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
     if (opt_index < 0)
 	return EXIT_FAILURE;
 
+    if (! opt.output)
+	opt.output = OUTPUT_SUMMARY;
+
     switch (format_sel) {
     case NOTMUCH_FORMAT_TEXT:
 	opt.format = sprinter_text_create (config, stdout);
@@ -455,19 +458,17 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 	notmuch_query_set_omit_excluded (opt.query, exclude);
     }
 
-    switch (opt.output) {
-    default:
-    case OUTPUT_SUMMARY:
-    case OUTPUT_THREADS:
+    if (opt.output == OUTPUT_SUMMARY ||
+	opt.output == OUTPUT_THREADS)
 	ret = do_search_threads (&opt);
-	break;
-    case OUTPUT_MESSAGES:
-    case OUTPUT_FILES:
+    else if (opt.output == OUTPUT_MESSAGES ||
+	     opt.output == OUTPUT_FILES)
 	ret = do_search_messages (&opt);
-	break;
-    case OUTPUT_TAGS:
+    else if (opt.output == OUTPUT_TAGS)
 	ret = do_search_tags (notmuch, &opt);
-	break;
+    else {
+	fprintf (stderr, "Error: the combination of outputs is not supported.\n");
+	ret = 1;
     }
 
     notmuch_query_destroy (opt.query);
-- 
2.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 4/7] cli: search: Add --output={sender,recipients}
  2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
                   ` (2 preceding siblings ...)
  2014-10-30 23:59 ` [PATCH v5 3/7] cli: search: Convert --output to keyword-flag argument Michal Sojka
@ 2014-10-30 23:59 ` Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 5/7] cli: search: Do not output duplicate addresses Michal Sojka
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-30 23:59 UTC (permalink / raw)
  To: notmuch

The new outputs allow printing senders, recipients or both of matching
messages. To print both, the user can use --output=sender and
--output=recipients simultaneously.

Currently, the same address can appear multiple times in the output.
The next commit will change this. For this reason, tests are
introduced there.

We use mailbox_t rather than InternetAddressMailbox because we will
need to extend it in a following commit.

This code is based on a patch from Jani Nikula.
---
 completion/notmuch-completion.bash |   2 +-
 completion/notmuch-completion.zsh  |   3 +-
 doc/man1/notmuch-search.rst        |  22 ++++++-
 notmuch-search.c                   | 115 ++++++++++++++++++++++++++++++++++++-
 4 files changed, 137 insertions(+), 5 deletions(-)

diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash
index 0571dc9..cfbd389 100644
--- a/completion/notmuch-completion.bash
+++ b/completion/notmuch-completion.bash
@@ -294,7 +294,7 @@ _notmuch_search()
 	    return
 	    ;;
 	--output)
-	    COMPREPLY=( $( compgen -W "summary threads messages files tags" -- "${cur}" ) )
+	    COMPREPLY=( $( compgen -W "summary threads messages files tags sender recipients" -- "${cur}" ) )
 	    return
 	    ;;
 	--sort)
diff --git a/completion/notmuch-completion.zsh b/completion/notmuch-completion.zsh
index 67a9aba..3e52a00 100644
--- a/completion/notmuch-completion.zsh
+++ b/completion/notmuch-completion.zsh
@@ -52,7 +52,8 @@ _notmuch_search()
   _arguments -s : \
     '--max-threads=[display only the first x threads from the search results]:number of threads to show: ' \
     '--first=[omit the first x threads from the search results]:number of threads to omit: ' \
-    '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))'
+    '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))' \
+    '--output=[select what to output]:output:((summary threads messages files tags sender recipients))'
 }
 
 _notmuch()
diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst
index 90160f2..b6607c9 100644
--- a/doc/man1/notmuch-search.rst
+++ b/doc/man1/notmuch-search.rst
@@ -35,7 +35,7 @@ Supported options for **search** include
         intended for programs that invoke **notmuch(1)** internally. If
         omitted, the latest supported version will be used.
 
-    ``--output=(summary|threads|messages|files|tags)``
+    ``--output=(summary|threads|messages|files|tags|sender|recipients)``
 
         **summary**
             Output a summary of each thread with any message matching
@@ -78,6 +78,26 @@ Supported options for **search** include
             by null characters (--format=text0), as a JSON array
             (--format=json), or as an S-Expression list (--format=sexp).
 
+	**sender**
+            Output all addresses from the *From* header that appear on
+            any message matching the search terms, either one per line
+            (--format=text), separated by null characters
+            (--format=text0), as a JSON array (--format=json), or as
+            an S-Expression list (--format=sexp).
+
+	    Note: Searching for **sender** should be much faster than
+	    searching for **recipients**, because sender addresses are
+	    cached directly in the database whereas other addresses
+	    need to be fetched from message files.
+
+	**recipients**
+            Like **sender** but for addresses from *To*, *Cc* and
+	    *Bcc* headers.
+
+	This option can be given multiple times to combine different
+	outputs. Currently, this is only supported for **sender** and
+	**recipients** outputs.
+
     ``--sort=``\ (**newest-first**\ \|\ **oldest-first**)
         This option can be used to present results in either
         chronological order (**oldest-first**) or reverse chronological
diff --git a/notmuch-search.c b/notmuch-search.c
index ce46877..31e4221 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -28,8 +28,12 @@ typedef enum {
     OUTPUT_MESSAGES	= 1 << 2,
     OUTPUT_FILES	= 1 << 3,
     OUTPUT_TAGS		= 1 << 4,
+    OUTPUT_SENDER	= 1 << 5,
+    OUTPUT_RECIPIENTS	= 1 << 6,
 } output_t;
 
+#define OUTPUT_ADDRESS_FLAGS (OUTPUT_SENDER | OUTPUT_RECIPIENTS)
+
 typedef struct {
     sprinter_t *format;
     notmuch_query_t *query;
@@ -40,6 +44,11 @@ typedef struct {
     int dupe;
 } search_options_t;
 
+typedef struct {
+    const char *name;
+    const char *addr;
+} mailbox_t;
+
 /* Return two stable query strings that identify exactly the matched
  * and unmatched messages currently in thread.  If there are no
  * matched or unmatched messages, the returned buffers will be
@@ -220,6 +229,87 @@ do_search_threads (search_options_t *opt)
     return 0;
 }
 
+static void
+print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
+{
+    const char *name = mailbox->name;
+    const char *addr = mailbox->addr;
+    sprinter_t *format = opt->format;
+
+    if (format->is_text_printer) {
+	char *mailbox_str;
+
+	if (name && *name)
+	    mailbox_str = talloc_asprintf (format, "%s <%s>", name, addr);
+	else
+	    mailbox_str = talloc_strdup (format, addr);
+
+	if (! mailbox_str) {
+	    fprintf (stderr, "Error: out of memory\n");
+	    return;
+	}
+	format->string (format, mailbox_str);
+	format->separator (format);
+
+	talloc_free (mailbox_str);
+    } else {
+	format->begin_map (format);
+	format->map_key (format, "name");
+	format->string (format, name);
+	format->map_key (format, "address");
+	format->string (format, addr);
+	format->end (format);
+	format->separator (format);
+    }
+}
+
+/* Print addresses from InternetAddressList.  */
+static void
+process_address_list (const search_options_t *opt, InternetAddressList *list)
+{
+    InternetAddress *address;
+    int i;
+
+    for (i = 0; i < internet_address_list_length (list); i++) {
+	address = internet_address_list_get_address (list, i);
+	if (INTERNET_ADDRESS_IS_GROUP (address)) {
+	    InternetAddressGroup *group;
+	    InternetAddressList *group_list;
+
+	    group = INTERNET_ADDRESS_GROUP (address);
+	    group_list = internet_address_group_get_members (group);
+	    if (group_list == NULL)
+		continue;
+
+	    process_address_list (opt, group_list);
+	} else {
+	    InternetAddressMailbox *mailbox = INTERNET_ADDRESS_MAILBOX (address);
+	    mailbox_t mbx = {
+		.name = internet_address_get_name (address),
+		.addr = internet_address_mailbox_get_addr (mailbox),
+	    };
+
+	    print_mailbox (opt, &mbx);
+	}
+    }
+}
+
+/* Print addresses from a message header.  */
+static void
+process_address_header (const search_options_t *opt, const char *value)
+{
+    InternetAddressList *list;
+
+    if (value == NULL)
+	return;
+
+    list = internet_address_list_parse_string (value);
+    if (list == NULL)
+	return;
+
+    process_address_list (opt, list);
+}
+
 static int
 do_search_messages (search_options_t *opt)
 {
@@ -266,11 +356,29 @@ do_search_messages (search_options_t *opt)
 	    
 	    notmuch_filenames_destroy( filenames );
 
-	} else { /* output == OUTPUT_MESSAGES */
+	} else if (opt->output == OUTPUT_MESSAGES) {
 	    format->set_prefix (format, "id");
 	    format->string (format,
 			    notmuch_message_get_message_id (message));
 	    format->separator (format);
+	} else {
+	    if (opt->output & OUTPUT_SENDER) {
+		const char *addrs;
+
+		addrs = notmuch_message_get_header (message, "from");
+		process_address_header (opt, addrs);
+	    }
+
+	    if (opt->output & OUTPUT_RECIPIENTS) {
+		const char *hdrs[] = { "to", "cc", "bcc" };
+		const char *addrs;
+		size_t j;
+
+		for (j = 0; j < ARRAY_SIZE (hdrs); j++) {
+		    addrs = notmuch_message_get_header (message, hdrs[j]);
+		    process_address_header (opt, addrs);
+		}
+	    }
 	}
 
 	notmuch_message_destroy (message);
@@ -371,6 +479,8 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 	  (notmuch_keyword_t []){ { "summary", OUTPUT_SUMMARY },
 				  { "threads", OUTPUT_THREADS },
 				  { "messages", OUTPUT_MESSAGES },
+				  { "sender", OUTPUT_SENDER },
+				  { "recipients", OUTPUT_RECIPIENTS },
 				  { "files", OUTPUT_FILES },
 				  { "tags", OUTPUT_TAGS },
 				  { 0, 0 } } },
@@ -462,7 +572,8 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 	opt.output == OUTPUT_THREADS)
 	ret = do_search_threads (&opt);
     else if (opt.output == OUTPUT_MESSAGES ||
-	     opt.output == OUTPUT_FILES)
+	     opt.output == OUTPUT_FILES ||
+	     (opt.output & OUTPUT_ADDRESS_FLAGS && !(opt.output & ~OUTPUT_ADDRESS_FLAGS)))
 	ret = do_search_messages (&opt);
     else if (opt.output == OUTPUT_TAGS)
 	ret = do_search_tags (notmuch, &opt);
-- 
2.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 5/7] cli: search: Do not output duplicate addresses
  2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
                   ` (3 preceding siblings ...)
  2014-10-30 23:59 ` [PATCH v5 4/7] cli: search: Add --output={sender,recipients} Michal Sojka
@ 2014-10-30 23:59 ` Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 6/7] cli: search: Add --output=count Michal Sojka
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-30 23:59 UTC (permalink / raw)
  To: notmuch

This filters out duplicate addresses from address outputs (sender,
receivers).

It also also adds tests for the new outputs.

The code here is an extended version of a patch from Jani Nikula.
---
 doc/man1/notmuch-search.rst |  2 ++
 notmuch-search.c            | 51 ++++++++++++++++++++++----
 test/T090-search-output.sh  | 87 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 134 insertions(+), 6 deletions(-)

diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst
index b6607c9..42f17e4 100644
--- a/doc/man1/notmuch-search.rst
+++ b/doc/man1/notmuch-search.rst
@@ -85,6 +85,8 @@ Supported options for **search** include
             (--format=text0), as a JSON array (--format=json), or as
             an S-Expression list (--format=sexp).
 
+            Duplicate addresses are filtered out.
+
 	    Note: Searching for **sender** should be much faster than
 	    searching for **recipients**, because sender addresses are
 	    cached directly in the database whereas other addresses
diff --git a/notmuch-search.c b/notmuch-search.c
index 31e4221..eae749a 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -229,6 +229,27 @@ do_search_threads (search_options_t *opt)
     return 0;
 }
 
+/* Returns TRUE iff name and addr is duplicate. */
+static notmuch_bool_t
+is_duplicate (const search_options_t *opt, GHashTable *addrs, const char *name, const char *addr)
+{
+    notmuch_bool_t duplicate;
+    char *key;
+
+    key = talloc_asprintf (opt->format, "%s <%s>", name, addr);
+    if (! key)
+	return FALSE;
+
+    duplicate = g_hash_table_lookup_extended (addrs, key, NULL, NULL);
+
+    if (! duplicate)
+	g_hash_table_insert (addrs, key, NULL);
+    else
+	talloc_free (key);
+
+    return duplicate;
+}
+
 static void
 print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
 {
@@ -265,7 +286,8 @@ print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
 
 /* Print addresses from InternetAddressList.  */
 static void
-process_address_list (const search_options_t *opt, InternetAddressList *list)
+process_address_list (const search_options_t *opt, GHashTable *addrs,
+		      InternetAddressList *list)
 {
     InternetAddress *address;
     int i;
@@ -281,7 +303,7 @@ process_address_list (const search_options_t *opt, InternetAddressList *list)
 	    if (group_list == NULL)
 		continue;
 
-	    process_address_list (opt, group_list);
+	    process_address_list (opt, addrs, group_list);
 	} else {
 	    InternetAddressMailbox *mailbox = INTERNET_ADDRESS_MAILBOX (address);
 	    mailbox_t mbx = {
@@ -289,6 +311,9 @@ process_address_list (const search_options_t *opt, InternetAddressList *list)
 		.addr = internet_address_mailbox_get_addr (mailbox),
 	    };
 
+	    if (is_duplicate (opt, addrs, mbx.name, mbx.addr))
+		continue;
+
 	    print_mailbox (opt, &mbx);
 	}
     }
@@ -296,7 +321,7 @@ process_address_list (const search_options_t *opt, InternetAddressList *list)
 
 /* Print addresses from a message header.  */
 static void
-process_address_header (const search_options_t *opt, const char *value)
+process_address_header (const search_options_t *opt, GHashTable *addrs, const char *value)
 {
     InternetAddressList *list;
 
@@ -307,7 +332,13 @@ process_address_header (const search_options_t *opt, const char *value)
     if (list == NULL)
 	return;
 
-    process_address_list (opt, list);
+    process_address_list (opt, addrs, list);
+}
+
+static void
+_my_talloc_free_for_g_hash (void *ptr)
+{
+    talloc_free (ptr);
 }
 
 static int
@@ -317,8 +348,13 @@ do_search_messages (search_options_t *opt)
     notmuch_messages_t *messages;
     notmuch_filenames_t *filenames;
     sprinter_t *format = opt->format;
+    GHashTable *addresses = NULL;
     int i;
 
+    if (opt->output & OUTPUT_ADDRESS_FLAGS)
+	addresses = g_hash_table_new_full (g_str_hash, g_str_equal,
+					   _my_talloc_free_for_g_hash, NULL);
+
     if (opt->offset < 0) {
 	opt->offset += notmuch_query_count_messages (opt->query);
 	if (opt->offset < 0)
@@ -366,7 +402,7 @@ do_search_messages (search_options_t *opt)
 		const char *addrs;
 
 		addrs = notmuch_message_get_header (message, "from");
-		process_address_header (opt, addrs);
+		process_address_header (opt, addresses, addrs);
 	    }
 
 	    if (opt->output & OUTPUT_RECIPIENTS) {
@@ -376,7 +412,7 @@ do_search_messages (search_options_t *opt)
 
 		for (j = 0; j < ARRAY_SIZE (hdrs); j++) {
 		    addrs = notmuch_message_get_header (message, hdrs[j]);
-		    process_address_header (opt, addrs);
+		    process_address_header (opt, addresses, addrs);
 		}
 	    }
 	}
@@ -384,6 +420,9 @@ do_search_messages (search_options_t *opt)
 	notmuch_message_destroy (message);
     }
 
+    if (addresses)
+	g_hash_table_unref (addresses);
+
     notmuch_messages_destroy (messages);
 
     format->end (format);
diff --git a/test/T090-search-output.sh b/test/T090-search-output.sh
index 947d572..841a721 100755
--- a/test/T090-search-output.sh
+++ b/test/T090-search-output.sh
@@ -387,6 +387,93 @@ cat <<EOF >EXPECTED
 EOF
 test_expect_equal_file OUTPUT EXPECTED
 
+test_begin_subtest "--output=sender"
+notmuch search --output=sender '*' >OUTPUT
+cat <<EOF >EXPECTED
+François Boulogne <boulogne.f@gmail.com>
+Olivier Berger <olivier.berger@it-sudparis.eu>
+Chris Wilson <chris@chris-wilson.co.uk>
+Carl Worth <cworth@cworth.org>
+Alexander Botero-Lowry <alex.boterolowry@gmail.com>
+Keith Packard <keithp@keithp.com>
+Jjgod Jiang <gzjjgod@gmail.com>
+Rolland Santimano <rollandsantimano@yahoo.com>
+Jan Janak <jan@ryngle.com>
+Stewart Smith <stewart@flamingspork.com>
+Lars Kellogg-Stedman <lars@seas.harvard.edu>
+Alex Botero-Lowry <alex.boterolowry@gmail.com>
+Ingmar Vanhassel <ingmar@exherbo.org>
+Aron Griffis <agriffis@n01se.net>
+Adrian Perez de Castro <aperez@igalia.com>
+Israel Herraiz <isra@herraiz.org>
+Mikhail Gusarov <dottedmag@dottedmag.net>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "--output=sender --format=json"
+notmuch search --output=sender --format=json '*' >OUTPUT
+cat <<EOF >EXPECTED
+[{"name": "François Boulogne", "address": "boulogne.f@gmail.com"},
+{"name": "Olivier Berger", "address": "olivier.berger@it-sudparis.eu"},
+{"name": "Chris Wilson", "address": "chris@chris-wilson.co.uk"},
+{"name": "Carl Worth", "address": "cworth@cworth.org"},
+{"name": "Alexander Botero-Lowry", "address": "alex.boterolowry@gmail.com"},
+{"name": "Keith Packard", "address": "keithp@keithp.com"},
+{"name": "Jjgod Jiang", "address": "gzjjgod@gmail.com"},
+{"name": "Rolland Santimano", "address": "rollandsantimano@yahoo.com"},
+{"name": "Jan Janak", "address": "jan@ryngle.com"},
+{"name": "Stewart Smith", "address": "stewart@flamingspork.com"},
+{"name": "Lars Kellogg-Stedman", "address": "lars@seas.harvard.edu"},
+{"name": "Alex Botero-Lowry", "address": "alex.boterolowry@gmail.com"},
+{"name": "Ingmar Vanhassel", "address": "ingmar@exherbo.org"},
+{"name": "Aron Griffis", "address": "agriffis@n01se.net"},
+{"name": "Adrian Perez de Castro", "address": "aperez@igalia.com"},
+{"name": "Israel Herraiz", "address": "isra@herraiz.org"},
+{"name": "Mikhail Gusarov", "address": "dottedmag@dottedmag.net"}]
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "--output=recipients"
+notmuch search --output=recipients '*' >OUTPUT
+cat <<EOF >EXPECTED
+Allan McRae <allan@archlinux.org>
+Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
+olivier.berger@it-sudparis.eu
+notmuch@notmuchmail.org
+notmuch <notmuch@notmuchmail.org>
+Keith Packard <keithp@keithp.com>
+Mikhail Gusarov <dottedmag@dottedmag.net>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "--output=sender --output=recipients"
+notmuch search --output=sender --output=recipients '*' >OUTPUT
+cat <<EOF >EXPECTED
+François Boulogne <boulogne.f@gmail.com>
+Allan McRae <allan@archlinux.org>
+Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
+Olivier Berger <olivier.berger@it-sudparis.eu>
+olivier.berger@it-sudparis.eu
+Chris Wilson <chris@chris-wilson.co.uk>
+notmuch@notmuchmail.org
+Carl Worth <cworth@cworth.org>
+Alexander Botero-Lowry <alex.boterolowry@gmail.com>
+Keith Packard <keithp@keithp.com>
+Jjgod Jiang <gzjjgod@gmail.com>
+Rolland Santimano <rollandsantimano@yahoo.com>
+Jan Janak <jan@ryngle.com>
+Stewart Smith <stewart@flamingspork.com>
+Lars Kellogg-Stedman <lars@seas.harvard.edu>
+notmuch <notmuch@notmuchmail.org>
+Alex Botero-Lowry <alex.boterolowry@gmail.com>
+Ingmar Vanhassel <ingmar@exherbo.org>
+Aron Griffis <agriffis@n01se.net>
+Adrian Perez de Castro <aperez@igalia.com>
+Israel Herraiz <isra@herraiz.org>
+Mikhail Gusarov <dottedmag@dottedmag.net>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
 test_begin_subtest "sanitize output for quoted-printable line-breaks in author and subject"
 add_message "[subject]='two =?ISO-8859-1?Q?line=0A_subject?=
 	headers'"
-- 
2.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 6/7] cli: search: Add --output=count
  2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
                   ` (4 preceding siblings ...)
  2014-10-30 23:59 ` [PATCH v5 5/7] cli: search: Do not output duplicate addresses Michal Sojka
@ 2014-10-30 23:59 ` Michal Sojka
  2014-10-30 23:59 ` [PATCH v5 7/7] cli: search: Add --filter-by option to configure address filtering Michal Sojka
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-30 23:59 UTC (permalink / raw)
  To: notmuch

This output can be used with --output=recipients or --output=sender
and in addition to the addresses, it prints how many times was each
address encountered during search.
---
 completion/notmuch-completion.bash |  2 +-
 completion/notmuch-completion.zsh  |  2 +-
 doc/man1/notmuch-search.rst        |  9 +++++--
 notmuch-search.c                   | 51 ++++++++++++++++++++++++++++++++------
 test/T090-search-output.sh         | 50 +++++++++++++++++++++++++++++++++++++
 5 files changed, 102 insertions(+), 12 deletions(-)

diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash
index cfbd389..39cd829 100644
--- a/completion/notmuch-completion.bash
+++ b/completion/notmuch-completion.bash
@@ -294,7 +294,7 @@ _notmuch_search()
 	    return
 	    ;;
 	--output)
-	    COMPREPLY=( $( compgen -W "summary threads messages files tags sender recipients" -- "${cur}" ) )
+	    COMPREPLY=( $( compgen -W "summary threads messages files tags sender recipients count" -- "${cur}" ) )
 	    return
 	    ;;
 	--sort)
diff --git a/completion/notmuch-completion.zsh b/completion/notmuch-completion.zsh
index 3e52a00..d7e5a5e 100644
--- a/completion/notmuch-completion.zsh
+++ b/completion/notmuch-completion.zsh
@@ -53,7 +53,7 @@ _notmuch_search()
     '--max-threads=[display only the first x threads from the search results]:number of threads to show: ' \
     '--first=[omit the first x threads from the search results]:number of threads to omit: ' \
     '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))' \
-    '--output=[select what to output]:output:((summary threads messages files tags sender recipients))'
+    '--output=[select what to output]:output:((summary threads messages files tags sender recipients count))'
 }
 
 _notmuch()
diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst
index 42f17e4..ec89200 100644
--- a/doc/man1/notmuch-search.rst
+++ b/doc/man1/notmuch-search.rst
@@ -96,9 +96,14 @@ Supported options for **search** include
             Like **sender** but for addresses from *To*, *Cc* and
 	    *Bcc* headers.
 
+	**count**
+	    Can be used in combination with **sender** or
+	    **recipients** to print the count of how many times was
+	    the address encountered during search.
+
 	This option can be given multiple times to combine different
-	outputs. Currently, this is only supported for **sender** and
-	**recipients** outputs.
+	outputs. Currently, this is only supported for **sender**,
+	**recipients** and **count** outputs.
 
     ``--sort=``\ (**newest-first**\ \|\ **oldest-first**)
         This option can be used to present results in either
diff --git a/notmuch-search.c b/notmuch-search.c
index eae749a..15527c4 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -30,9 +30,10 @@ typedef enum {
     OUTPUT_TAGS		= 1 << 4,
     OUTPUT_SENDER	= 1 << 5,
     OUTPUT_RECIPIENTS	= 1 << 6,
+    OUTPUT_COUNT	= 1 << 7,
 } output_t;
 
-#define OUTPUT_ADDRESS_FLAGS (OUTPUT_SENDER | OUTPUT_RECIPIENTS)
+#define OUTPUT_ADDRESS_FLAGS (OUTPUT_SENDER | OUTPUT_RECIPIENTS | OUTPUT_COUNT)
 
 typedef struct {
     sprinter_t *format;
@@ -47,6 +48,7 @@ typedef struct {
 typedef struct {
     const char *name;
     const char *addr;
+    int count;
 } mailbox_t;
 
 /* Return two stable query strings that identify exactly the matched
@@ -235,17 +237,24 @@ is_duplicate (const search_options_t *opt, GHashTable *addrs, const char *name,
 {
     notmuch_bool_t duplicate;
     char *key;
+    mailbox_t *mailbox;
 
     key = talloc_asprintf (opt->format, "%s <%s>", name, addr);
     if (! key)
 	return FALSE;
 
-    duplicate = g_hash_table_lookup_extended (addrs, key, NULL, NULL);
+    duplicate = g_hash_table_lookup_extended (addrs, key, NULL, (gpointer)&mailbox);
 
-    if (! duplicate)
-	g_hash_table_insert (addrs, key, NULL);
-    else
+    if (! duplicate) {
+	mailbox = talloc (opt->format, mailbox_t);
+	mailbox->name = talloc_strdup (mailbox, name);
+	mailbox->addr = talloc_strdup (mailbox, addr);
+	mailbox->count = 1;
+	g_hash_table_insert (addrs, key, mailbox);
+    } else {
+	mailbox->count++;
 	talloc_free (key);
+    }
 
     return duplicate;
 }
@@ -255,6 +264,7 @@ print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
 {
     const char *name = mailbox->name;
     const char *addr = mailbox->addr;
+    int count = mailbox->count;
     sprinter_t *format = opt->format;
 
     if (format->is_text_printer) {
@@ -269,6 +279,10 @@ print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
 	    fprintf (stderr, "Error: out of memory\n");
 	    return;
 	}
+	if (count > 0) {
+	    format->integer (format, count);
+	    format->string (format, "\t");
+	}
 	format->string (format, mailbox_str);
 	format->separator (format);
 
@@ -279,12 +293,16 @@ print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
 	format->string (format, name);
 	format->map_key (format, "address");
 	format->string (format, addr);
+	if (count > 0) {
+	    format->map_key (format, "count");
+	    format->integer (format, count);
+	}
 	format->end (format);
 	format->separator (format);
     }
 }
 
-/* Print addresses from InternetAddressList.  */
+/* Print or prepare for printing addresses from InternetAddressList. */
 static void
 process_address_list (const search_options_t *opt, GHashTable *addrs,
 		      InternetAddressList *list)
@@ -309,17 +327,21 @@ process_address_list (const search_options_t *opt, GHashTable *addrs,
 	    mailbox_t mbx = {
 		.name = internet_address_get_name (address),
 		.addr = internet_address_mailbox_get_addr (mailbox),
+		.count = 0,
 	    };
 
 	    if (is_duplicate (opt, addrs, mbx.name, mbx.addr))
 		continue;
 
+	    if (opt->output & OUTPUT_COUNT)
+		continue;
+
 	    print_mailbox (opt, &mbx);
 	}
     }
 }
 
-/* Print addresses from a message header.  */
+/* Print or prepare for printing addresses from a message header. */
 static void
 process_address_header (const search_options_t *opt, GHashTable *addrs, const char *value)
 {
@@ -341,6 +363,15 @@ _my_talloc_free_for_g_hash (void *ptr)
     talloc_free (ptr);
 }
 
+static void
+print_hash_value (unused (gpointer key), gpointer value, gpointer user_data)
+{
+    const mailbox_t *mailbox = value;
+    search_options_t *opt = user_data;
+
+    print_mailbox (opt, mailbox);
+}
+
 static int
 do_search_messages (search_options_t *opt)
 {
@@ -353,7 +384,7 @@ do_search_messages (search_options_t *opt)
 
     if (opt->output & OUTPUT_ADDRESS_FLAGS)
 	addresses = g_hash_table_new_full (g_str_hash, g_str_equal,
-					   _my_talloc_free_for_g_hash, NULL);
+					   _my_talloc_free_for_g_hash, _my_talloc_free_for_g_hash);
 
     if (opt->offset < 0) {
 	opt->offset += notmuch_query_count_messages (opt->query);
@@ -420,6 +451,9 @@ do_search_messages (search_options_t *opt)
 	notmuch_message_destroy (message);
     }
 
+    if (addresses && opt->output & OUTPUT_COUNT)
+	g_hash_table_foreach (addresses, print_hash_value, opt);
+
     if (addresses)
 	g_hash_table_unref (addresses);
 
@@ -522,6 +556,7 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 				  { "recipients", OUTPUT_RECIPIENTS },
 				  { "files", OUTPUT_FILES },
 				  { "tags", OUTPUT_TAGS },
+				  { "count", OUTPUT_COUNT },
 				  { 0, 0 } } },
         { NOTMUCH_OPT_KEYWORD, &exclude, "exclude", 'x',
           (notmuch_keyword_t []){ { "true", NOTMUCH_EXCLUDE_TRUE },
diff --git a/test/T090-search-output.sh b/test/T090-search-output.sh
index 841a721..5a9bbc9 100755
--- a/test/T090-search-output.sh
+++ b/test/T090-search-output.sh
@@ -433,6 +433,56 @@ cat <<EOF >EXPECTED
 EOF
 test_expect_equal_file OUTPUT EXPECTED
 
+test_begin_subtest "--output=sender --output=count"
+notmuch search --output=sender --output=count '*' | sort -n >OUTPUT
+cat <<EOF >EXPECTED
+1	Adrian Perez de Castro <aperez@igalia.com>
+1	Aron Griffis <agriffis@n01se.net>
+1	Chris Wilson <chris@chris-wilson.co.uk>
+1	François Boulogne <boulogne.f@gmail.com>
+1	Ingmar Vanhassel <ingmar@exherbo.org>
+1	Israel Herraiz <isra@herraiz.org>
+1	Olivier Berger <olivier.berger@it-sudparis.eu>
+1	Rolland Santimano <rollandsantimano@yahoo.com>
+2	Alex Botero-Lowry <alex.boterolowry@gmail.com>
+2	Jjgod Jiang <gzjjgod@gmail.com>
+3	Stewart Smith <stewart@flamingspork.com>
+4	Alexander Botero-Lowry <alex.boterolowry@gmail.com>
+4	Jan Janak <jan@ryngle.com>
+5	Lars Kellogg-Stedman <lars@seas.harvard.edu>
+5	Mikhail Gusarov <dottedmag@dottedmag.net>
+7	Keith Packard <keithp@keithp.com>
+12	Carl Worth <cworth@cworth.org>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "--output=sender --output=count --format=json"
+# Since the iteration order of GHashTable is not specified, we
+# preprocess and sort the results to keep the order stable here.
+notmuch search --output=sender --output=count --format=json '*' | \
+    sed -e 's/^\[//' -e 's/]$//' -e 's/,$//' | \
+    sort --field-separator=":" --key=4n --key=2 >OUTPUT
+cat <<EOF >EXPECTED
+{"name": "Adrian Perez de Castro", "address": "aperez@igalia.com", "count": 1}
+{"name": "Aron Griffis", "address": "agriffis@n01se.net", "count": 1}
+{"name": "Chris Wilson", "address": "chris@chris-wilson.co.uk", "count": 1}
+{"name": "François Boulogne", "address": "boulogne.f@gmail.com", "count": 1}
+{"name": "Ingmar Vanhassel", "address": "ingmar@exherbo.org", "count": 1}
+{"name": "Israel Herraiz", "address": "isra@herraiz.org", "count": 1}
+{"name": "Olivier Berger", "address": "olivier.berger@it-sudparis.eu", "count": 1}
+{"name": "Rolland Santimano", "address": "rollandsantimano@yahoo.com", "count": 1}
+{"name": "Alex Botero-Lowry", "address": "alex.boterolowry@gmail.com", "count": 2}
+{"name": "Jjgod Jiang", "address": "gzjjgod@gmail.com", "count": 2}
+{"name": "Stewart Smith", "address": "stewart@flamingspork.com", "count": 3}
+{"name": "Alexander Botero-Lowry", "address": "alex.boterolowry@gmail.com", "count": 4}
+{"name": "Jan Janak", "address": "jan@ryngle.com", "count": 4}
+{"name": "Lars Kellogg-Stedman", "address": "lars@seas.harvard.edu", "count": 5}
+{"name": "Mikhail Gusarov", "address": "dottedmag@dottedmag.net", "count": 5}
+{"name": "Keith Packard", "address": "keithp@keithp.com", "count": 7}
+{"name": "Carl Worth", "address": "cworth@cworth.org", "count": 12}
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
 test_begin_subtest "--output=recipients"
 notmuch search --output=recipients '*' >OUTPUT
 cat <<EOF >EXPECTED
-- 
2.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 7/7] cli: search: Add --filter-by option to configure address filtering
  2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
                   ` (5 preceding siblings ...)
  2014-10-30 23:59 ` [PATCH v5 6/7] cli: search: Add --output=count Michal Sojka
@ 2014-10-30 23:59 ` Michal Sojka
  2014-10-31  8:54 ` [PATCH v5 0/7] notmuch search --output=sender/recipients Mark Walters
  2014-10-31 15:03 ` Jesse Rosenthal
  8 siblings, 0 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-30 23:59 UTC (permalink / raw)
  To: notmuch

This option allows to configure the criterion for duplicate address
filtering. Without this option, all unique combinations of name and
address parts are printed. This option allows to filter the output
more, for example to only contain unique address parts.
---
 completion/notmuch-completion.bash |  6 +++-
 completion/notmuch-completion.zsh  |  3 +-
 doc/man1/notmuch-search.rst        | 39 ++++++++++++++++++++++-
 notmuch-search.c                   | 51 ++++++++++++++++++++++++++++--
 test/T095-search-filter-by.sh      | 64 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 158 insertions(+), 5 deletions(-)
 create mode 100755 test/T095-search-filter-by.sh

diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash
index 39cd829..b625b02 100644
--- a/completion/notmuch-completion.bash
+++ b/completion/notmuch-completion.bash
@@ -305,12 +305,16 @@ _notmuch_search()
 	    COMPREPLY=( $( compgen -W "true false flag all" -- "${cur}" ) )
 	    return
 	    ;;
+	--filter-by)
+	    COMPREPLY=( $( compgen -W "nameaddr name addr addrfold nameaddrfold" -- "${cur}" ) )
+	    return
+	    ;;
     esac
 
     ! $split &&
     case "${cur}" in
 	-*)
-	    local options="--format= --output= --sort= --offset= --limit= --exclude= --duplicate="
+	    local options="--format= --output= --sort= --offset= --limit= --exclude= --duplicate= --filter-by="
 	    compopt -o nospace
 	    COMPREPLY=( $(compgen -W "$options" -- ${cur}) )
 	    ;;
diff --git a/completion/notmuch-completion.zsh b/completion/notmuch-completion.zsh
index d7e5a5e..c1ccc32 100644
--- a/completion/notmuch-completion.zsh
+++ b/completion/notmuch-completion.zsh
@@ -53,7 +53,8 @@ _notmuch_search()
     '--max-threads=[display only the first x threads from the search results]:number of threads to show: ' \
     '--first=[omit the first x threads from the search results]:number of threads to omit: ' \
     '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))' \
-    '--output=[select what to output]:output:((summary threads messages files tags sender recipients count))'
+    '--output=[select what to output]:output:((summary threads messages files tags sender recipients count))' \
+    '--filter-by=[filter out duplicate addresses]:filter-by:((nameaddr\:"both name and address part" name\:"name part" addr\:"address part" addrfold\:"case-insensitive address part" nameaddrfold\:"name and case-insensitive address part"))'
 }
 
 _notmuch()
diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst
index ec89200..3a5556b 100644
--- a/doc/man1/notmuch-search.rst
+++ b/doc/man1/notmuch-search.rst
@@ -85,7 +85,8 @@ Supported options for **search** include
             (--format=text0), as a JSON array (--format=json), or as
             an S-Expression list (--format=sexp).
 
-            Duplicate addresses are filtered out.
+            Duplicate addresses are filtered out. Filtering can be
+            configured with the --filter-by option.
 
 	    Note: Searching for **sender** should be much faster than
 	    searching for **recipients**, because sender addresses are
@@ -158,6 +159,42 @@ Supported options for **search** include
         prefix. The prefix matches messages based on filenames. This
         option filters filenames of the matching messages.
 
+    ``--filter-by=``\ (**nameaddr**\ \|\ **name** \|\ **addr**\ \|\ **addrfold**\ \|\ **nameaddrfold**\)
+
+	Can be used with ``--output=sender`` or
+	``--output=recipients`` to filter out duplicate addresses. The
+	filtering algorithm receives a sequence of email addresses and
+	outputs the same sequence without the addresses that are
+	considered a duplicate of a previously output address. What is
+	considered a duplicate depends on how the two addresses are
+	compared and this can be controlled with the following
+	keywords:
+
+	**nameaddr** means that both name and address parts are
+	compared in case-sensitive manner. Therefore, all same looking
+	addresses strings are considered duplicate. This is the
+	default.
+
+	**name** means that only the name part is compared (in
+	case-sensitive manner). For example, the addresses "John Doe
+	<me@example.com>" and "John Doe <john@doe.name>" will be
+	considered duplicate.
+
+	**addr** means that only the address part is compared (in
+	case-sensitive manner). For example, the addresses "John Doe
+	<john@example.com>" and "Dr. John Doe <john@example.com>" will
+	be considered duplicate.
+
+	**addrfold** is like **addr**, but comparison is done in
+	canse-insensitive manner. For example, the addresses "John Doe
+	<john@example.com>" and "Dr. John Doe <JOHN@EXAMPLE.COM>" will
+	be considered duplicate.
+
+	**nameaddrfold** is like **nameaddr**, but address comparison
+	is done in canse-insensitive manner. For example, the
+	addresses "John Doe <john@example.com>" and "John Doe
+	<JOHN@EXAMPLE.COM>" will be considered duplicate.
+
 EXIT STATUS
 ===========
 
diff --git a/notmuch-search.c b/notmuch-search.c
index 15527c4..8bc80d3 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -35,6 +35,14 @@ typedef enum {
 
 #define OUTPUT_ADDRESS_FLAGS (OUTPUT_SENDER | OUTPUT_RECIPIENTS | OUTPUT_COUNT)
 
+typedef enum {
+    FILTER_BY_NAMEADDR = 0,
+    FILTER_BY_NAME,
+    FILTER_BY_ADDR,
+    FILTER_BY_ADDRFOLD,
+    FILTER_BY_NAMEADDRFOLD,
+} filter_by_t;
+
 typedef struct {
     sprinter_t *format;
     notmuch_query_t *query;
@@ -43,6 +51,7 @@ typedef struct {
     int offset;
     int limit;
     int dupe;
+    filter_by_t filter_by;
 } search_options_t;
 
 typedef struct {
@@ -231,7 +240,7 @@ do_search_threads (search_options_t *opt)
     return 0;
 }
 
-/* Returns TRUE iff name and addr is duplicate. */
+/* Returns TRUE iff name and/or addr is considered duplicate. */
 static notmuch_bool_t
 is_duplicate (const search_options_t *opt, GHashTable *addrs, const char *name, const char *addr)
 {
@@ -239,7 +248,32 @@ is_duplicate (const search_options_t *opt, GHashTable *addrs, const char *name,
     char *key;
     mailbox_t *mailbox;
 
-    key = talloc_asprintf (opt->format, "%s <%s>", name, addr);
+    if (opt->filter_by == FILTER_BY_ADDRFOLD ||
+	opt->filter_by == FILTER_BY_NAMEADDRFOLD) {
+	gchar *folded = g_utf8_casefold (addr, -1);
+	addr = talloc_strdup (opt->format, folded);
+	g_free (folded);
+    }
+    switch (opt->filter_by) {
+    case FILTER_BY_NAMEADDR:
+    case FILTER_BY_NAMEADDRFOLD:
+	key = talloc_asprintf (opt->format, "%s <%s>", name, addr);
+	break;
+    case FILTER_BY_NAME:
+	key = talloc_strdup (opt->format, name); /* !name results in !key */
+	break;
+    case FILTER_BY_ADDR:
+    case FILTER_BY_ADDRFOLD:
+	key = talloc_strdup (opt->format, addr);
+	break;
+    default:
+	INTERNAL_ERROR("invalid --filter-by flags");
+    }
+
+    if (opt->filter_by == FILTER_BY_ADDRFOLD ||
+	opt->filter_by == FILTER_BY_NAMEADDRFOLD)
+	talloc_free ((char*)addr);
+
     if (! key)
 	return FALSE;
 
@@ -523,6 +557,7 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 	.offset = 0,
 	.limit = -1, /* unlimited */
 	.dupe = -1,
+	.filter_by = FILTER_BY_NAMEADDR,
     };
     char *query_str;
     int opt_index, ret;
@@ -567,6 +602,13 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
 	{ NOTMUCH_OPT_INT, &opt.offset, "offset", 'O', 0 },
 	{ NOTMUCH_OPT_INT, &opt.limit, "limit", 'L', 0  },
 	{ NOTMUCH_OPT_INT, &opt.dupe, "duplicate", 'D', 0  },
+	{ NOTMUCH_OPT_KEYWORD, &opt.filter_by, "filter-by", 'b',
+	  (notmuch_keyword_t []){ { "nameaddr", FILTER_BY_NAMEADDR },
+				  { "name", FILTER_BY_NAME },
+				  { "addr", FILTER_BY_ADDR },
+				  { "addrfold", FILTER_BY_ADDRFOLD },
+				  { "nameaddrfold", FILTER_BY_NAMEADDRFOLD },
+				  { 0, 0 } } },
 	{ 0, 0, 0, 0, 0 }
     };
 
@@ -577,6 +619,11 @@ notmuch_search_command (notmuch_config_t *config, int argc, char *argv[])
     if (! opt.output)
 	opt.output = OUTPUT_SUMMARY;
 
+    if (opt.filter_by && !(opt.output & OUTPUT_ADDRESS_FLAGS)) {
+	fprintf (stderr, "Error: --filter-by can only be used with address output.\n");
+	return EXIT_FAILURE;
+    }
+
     switch (format_sel) {
     case NOTMUCH_FORMAT_TEXT:
 	opt.format = sprinter_text_create (config, stdout);
diff --git a/test/T095-search-filter-by.sh b/test/T095-search-filter-by.sh
new file mode 100755
index 0000000..97d9a9b
--- /dev/null
+++ b/test/T095-search-filter-by.sh
@@ -0,0 +1,64 @@
+#!/usr/bin/env bash
+test_description='duplicite address filtering in "notmuch search --output=recipients"'
+. ./test-lib.sh
+
+add_message '[to]="Real Name <foo@example.com>, Real Name <bar@example.com>"'
+add_message '[to]="Nickname <foo@example.com>"' '[cc]="Real Name <Bar@Example.COM>"'
+add_message '[to]="Nickname <foo@example.com>"' '[bcc]="Real Name <Bar@Example.COM>"'
+
+test_begin_subtest "--output=recipients"
+notmuch search --output=recipients "*" >OUTPUT
+cat <<EOF >EXPECTED
+Real Name <foo@example.com>
+Real Name <bar@example.com>
+Nickname <foo@example.com>
+Real Name <Bar@Example.COM>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "--output=recipients --filter-by=nameaddr"
+notmuch search --output=recipients --filter-by=nameaddr "*" >OUTPUT
+# The same as above
+cat <<EOF >EXPECTED
+Real Name <foo@example.com>
+Real Name <bar@example.com>
+Nickname <foo@example.com>
+Real Name <Bar@Example.COM>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "--output=recipients --filter-by=name"
+notmuch search --output=recipients --filter-by=name "*" >OUTPUT
+cat <<EOF >EXPECTED
+Real Name <foo@example.com>
+Nickname <foo@example.com>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "--output=recipients --filter-by=addr"
+notmuch search --output=recipients --filter-by=addr "*" >OUTPUT
+cat <<EOF >EXPECTED
+Real Name <foo@example.com>
+Real Name <bar@example.com>
+Real Name <Bar@Example.COM>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "--output=recipients --filter-by=addrfold"
+notmuch search --output=recipients --filter-by=addrfold "*" >OUTPUT
+cat <<EOF >EXPECTED
+Real Name <foo@example.com>
+Real Name <bar@example.com>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_begin_subtest "--output=recipients --filter-by=nameaddrfold"
+notmuch search --output=recipients --filter-by=nameaddrfold "*" >OUTPUT
+cat <<EOF >EXPECTED
+Real Name <foo@example.com>
+Real Name <bar@example.com>
+Nickname <foo@example.com>
+EOF
+test_expect_equal_file OUTPUT EXPECTED
+
+test_done
-- 
2.1.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 0/7] notmuch search --output=sender/recipients
  2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
                   ` (6 preceding siblings ...)
  2014-10-30 23:59 ` [PATCH v5 7/7] cli: search: Add --filter-by option to configure address filtering Michal Sojka
@ 2014-10-31  8:54 ` Mark Walters
  2014-10-31 10:20   ` Mark Walters
  2014-10-31 15:03 ` Jesse Rosenthal
  8 siblings, 1 reply; 16+ messages in thread
From: Mark Walters @ 2014-10-31  8:54 UTC (permalink / raw)
  To: Michal Sojka, notmuch

On Thu, 30 Oct 2014, Michal Sojka <sojkam1@fel.cvut.cz> wrote:
> Hi all,
>
> this is v5 of the search --output=address series. It obsoletes v4
> (id:1414421455-3037-1-git-send-email-sojkam1@fel.cvut.cz).
>
> I addresses comments from Mark and Tomi. Based on the comments to v4
> and earlier versions, patches 1-4 should be ready for merging. Patch 5
> is a non-controversial part of the controversial --filter-by patch and
> could be probably merged after review.

I have looked at Patches 1-5 and tested. These look good to me. +1

My only query is in the text output: should the name part be printed as
a quoted string. For example currently I get a line of the form

Bloggs, Fred <fred@example.com>

and I think in theory I could have a real name 

"Fred <stupid> Bloggs" which this would print without the quotes.

For the other formats it is much less of a problem because the name and
address are clearly separated.

I am happy with an answer of the form "for robust parseable results use a
structured format" which is what we say for search for example.

I just thought I would mention it in case you thought the quoted form
was more useful for consumers.

Best wishes

Mark


>
> Patch 6 needs at least a review and patch 7 needs more discussion.
>
> Changes from v4:
>
> - patch changed to commit in commit messages
> - opt->format changed to format
> - Added comments to process_* functions
> - duplicite changed to duplicate
> - check_duplicate changed to is_duplicate
> - Deduplication was split into two commits: basic deduplication
>   without a command line option and configurable deduplication with
>   --fiter-by.
>
> Changes from v3:
>
> - `o' renamed to `opt'.
> - Conversion of --output from keyword to keyword-flags is now a
>   separate patch.
> - Structured output formats print name and address separately.
> - Added test for --format=json.
> - Changed --filter-by default to nameaddr. In v2, the default was
>   addrfold, in v3 the default was no filtering at all. I believe that
>   Mark's suggestion to make nameaddr the default is good trade off.
> - Added new --output=count
> - Minor style fixes
> - Few typos fixed
> - There is no way to output unfiltered (duplicite) addresses.
>   Hopefully, the introduction of --output=count is sufficient
>   replacement for this "feature".
>
> Cheers,
> -Michal
>
>
> Jani Nikula (1):
>   cli: Add support for parsing keyword-flag arguments
>
> Michal Sojka (6):
>   cli: search: Refactor passing of command line options
>   cli: search: Convert --output to keyword-flag argument
>   cli: search: Add --output={sender,recipients}
>   cli: search: Do not output duplicate addresses
>   cli: search: Add --output=count
>   cli: search: Add --filter-by option to configure address filtering
>
>  command-line-arguments.c           |   6 +-
>  command-line-arguments.h           |   1 +
>  completion/notmuch-completion.bash |   8 +-
>  completion/notmuch-completion.zsh  |   4 +-
>  doc/man1/notmuch-search.rst        |  66 ++++++-
>  notmuch-search.c                   | 388 +++++++++++++++++++++++++++++--------
>  test/T090-search-output.sh         | 137 +++++++++++++
>  test/T095-search-filter-by.sh      |  64 ++++++
>  test/T410-argument-parsing.sh      |   3 +-
>  test/arg-test.c                    |   9 +
>  10 files changed, 604 insertions(+), 82 deletions(-)
>  create mode 100755 test/T095-search-filter-by.sh
>
> -- 
> 2.1.1
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 0/7] notmuch search --output=sender/recipients
  2014-10-31  8:54 ` [PATCH v5 0/7] notmuch search --output=sender/recipients Mark Walters
@ 2014-10-31 10:20   ` Mark Walters
  2014-10-31 15:32     ` Michal Sojka
  0 siblings, 1 reply; 16+ messages in thread
From: Mark Walters @ 2014-10-31 10:20 UTC (permalink / raw)
  To: Michal Sojka, notmuch

[-- Attachment #1: Type: text/plain, Size: 472 bytes --]


Hi

I attach a patch which does the quoting for real names but I don't know
if we want it: it changes (example taken from the test suite)

François Boulogne to

=?iso-8859-1?q?Fran=E7ois?= Boulogne

(which is what was in the original email)

Plausibly the best thing is just to leave the series as is, so the
text output is readable and parseable in the common cases.

Anyway the patch is attached if anyone wants to experiment.

Best wishes

Mark


[-- Attachment #2: 0001-search-quote-real-names-for-output-sender-recipient-.patch --]
[-- Type: text/x-diff, Size: 3524 bytes --]

From 53b1ced2d6a9fbbba93448325f795e6b99faa240 Mon Sep 17 00:00:00 2001
From: Mark Walters <markwalters1009@gmail.com>
Date: Fri, 31 Oct 2014 10:11:40 +0000
Subject: [PATCH] search: quote real names for output=sender/recipient in text
 format

This quotes the real name (when gmime thinks appropriate) for the text
output. For the other outputs the real name is separate from the
address so the consumer can do any quoting necessary.
---
 notmuch-search.c           |    8 ++++----
 test/T090-search-output.sh |    8 ++++----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/notmuch-search.c b/notmuch-search.c
index eae749a..8eac161 100644
--- a/notmuch-search.c
+++ b/notmuch-search.c
@@ -47,6 +47,7 @@ typedef struct {
 typedef struct {
     const char *name;
     const char *addr;
+    const char *string;
 } mailbox_t;
 
 /* Return two stable query strings that identify exactly the matched
@@ -255,15 +256,13 @@ print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
 {
     const char *name = mailbox->name;
     const char *addr = mailbox->addr;
+    const char *string = mailbox->string;
     sprinter_t *format = opt->format;
 
     if (format->is_text_printer) {
 	char *mailbox_str;
 
-	if (name && *name)
-	    mailbox_str = talloc_asprintf (format, "%s <%s>", name, addr);
-	else
-	    mailbox_str = talloc_strdup (format, addr);
+	mailbox_str = talloc_strdup (format, string);
 
 	if (! mailbox_str) {
 	    fprintf (stderr, "Error: out of memory\n");
@@ -309,6 +308,7 @@ process_address_list (const search_options_t *opt, GHashTable *addrs,
 	    mailbox_t mbx = {
 		.name = internet_address_get_name (address),
 		.addr = internet_address_mailbox_get_addr (mailbox),
+		.string = internet_address_to_string (address, TRUE),
 	    };
 
 	    if (is_duplicate (opt, addrs, mbx.name, mbx.addr))
diff --git a/test/T090-search-output.sh b/test/T090-search-output.sh
index 841a721..776e3f4 100755
--- a/test/T090-search-output.sh
+++ b/test/T090-search-output.sh
@@ -390,7 +390,7 @@ test_expect_equal_file OUTPUT EXPECTED
 test_begin_subtest "--output=sender"
 notmuch search --output=sender '*' >OUTPUT
 cat <<EOF >EXPECTED
-François Boulogne <boulogne.f@gmail.com>
+=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
 Olivier Berger <olivier.berger@it-sudparis.eu>
 Chris Wilson <chris@chris-wilson.co.uk>
 Carl Worth <cworth@cworth.org>
@@ -437,7 +437,7 @@ test_begin_subtest "--output=recipients"
 notmuch search --output=recipients '*' >OUTPUT
 cat <<EOF >EXPECTED
 Allan McRae <allan@archlinux.org>
-Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
+"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
 olivier.berger@it-sudparis.eu
 notmuch@notmuchmail.org
 notmuch <notmuch@notmuchmail.org>
@@ -449,9 +449,9 @@ test_expect_equal_file OUTPUT EXPECTED
 test_begin_subtest "--output=sender --output=recipients"
 notmuch search --output=sender --output=recipients '*' >OUTPUT
 cat <<EOF >EXPECTED
-François Boulogne <boulogne.f@gmail.com>
+=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
 Allan McRae <allan@archlinux.org>
-Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
+"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
 Olivier Berger <olivier.berger@it-sudparis.eu>
 olivier.berger@it-sudparis.eu
 Chris Wilson <chris@chris-wilson.co.uk>
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 0/7] notmuch search --output=sender/recipients
  2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
                   ` (7 preceding siblings ...)
  2014-10-31  8:54 ` [PATCH v5 0/7] notmuch search --output=sender/recipients Mark Walters
@ 2014-10-31 15:03 ` Jesse Rosenthal
  2014-10-31 15:43   ` Michal Sojka
  8 siblings, 1 reply; 16+ messages in thread
From: Jesse Rosenthal @ 2014-10-31 15:03 UTC (permalink / raw)
  To: Michal Sojka, notmuch

Dear Michal,

Thanks for all this. The feature looks great! One issue: I'm getting
corrupt output when using `--output=count` with "fold" filters:

~~~
jkr@ladybug [11:01AM] ~ $ notmuch search --output=recipients --output=count "tag:notmuch and date:today.." 
2	Jani Nikula <jani@nikula.org>
2	Michal Sojka <sojkam1@fel.cvut.cz>
1	David Edmondson <dme@dme.org>
9	notmuch@notmuchmail.org

jkr@ladybug [11:01AM] ~ $ notmuch search --output=recipients --output=count --filter-by=addrfold "tag:notmuch and date:today.."
2	Michal Sojka <0Fam1@fel.cvutH>
9	
1	David Edmondson <8dme.org>
2	Jani Nikula <jani@nikula.org>

jkr@ladybug [11:01AM] ~ $ notmuch search --output=recipients --output=count --filter-by=nameaddr "tag:notmuch and date:today.."
2	Jani Nikula <jani@nikula.org>
2	Michal Sojka <sojkam1@fel.cvut.cz>
1	David Edmondson <dme@dme.org>
9	notmuch@notmuchmail.org

jkr@ladybug [11:01AM] ~ $ notmuch search --output=recipients --output=count --filter-by=nameaddrfold "tag:notmuch and date:today.."
2	Jani Nikula <jani@nikula.org>
2	Michal Sojka <0V�	am1@fel.cvutH>
1	David Edmondson <8��	dme.org>
9
~~~

Not sure if this is expected behavior with the filtering mechanism in
its current state, but I wanted to call it to your attention if not.

Best,
Jesse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 0/7] notmuch search --output=sender/recipients
  2014-10-31 10:20   ` Mark Walters
@ 2014-10-31 15:32     ` Michal Sojka
  2014-10-31 16:46       ` Tomi Ollila
  2014-10-31 16:47       ` Mark Walters
  0 siblings, 2 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-31 15:32 UTC (permalink / raw)
  To: Mark Walters, notmuch

On Fri, Oct 31 2014, Mark Walters wrote:
> My only query is in the text output: should the name part be printed as
> a quoted string. For example currently I get a line of the form
>
> Bloggs, Fred <fred@example.com>

Good point.

On Fri, Oct 31 2014, Mark Walters wrote:
> Hi
>
> I attach a patch which does the quoting for real names but I don't know
> if we want it: it changes (example taken from the test suite)
>
> François Boulogne to
>
> =?iso-8859-1?q?Fran=E7ois?= Boulogne
>
> (which is what was in the original email)
>
> Plausibly the best thing is just to leave the series as is, so the
> text output is readable and parseable in the common cases.
>
> Anyway the patch is attached if anyone wants to experiment.
>
> Best wishes
>
> Mark
>
> From 53b1ced2d6a9fbbba93448325f795e6b99faa240 Mon Sep 17 00:00:00 2001
> From: Mark Walters <markwalters1009@gmail.com>
> Date: Fri, 31 Oct 2014 10:11:40 +0000
> Subject: [PATCH] search: quote real names for output=sender/recipient in text
>  format
>
> This quotes the real name (when gmime thinks appropriate) for the text
> output. For the other outputs the real name is separate from the
> address so the consumer can do any quoting necessary.
> ---
>  notmuch-search.c           |    8 ++++----
>  test/T090-search-output.sh |    8 ++++----
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/notmuch-search.c b/notmuch-search.c
> index eae749a..8eac161 100644
> --- a/notmuch-search.c
> +++ b/notmuch-search.c
> @@ -47,6 +47,7 @@ typedef struct {
>  typedef struct {
>      const char *name;
>      const char *addr;
> +    const char *string;
>  } mailbox_t;
>  
>  /* Return two stable query strings that identify exactly the matched
> @@ -255,15 +256,13 @@ print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
>  {
>      const char *name = mailbox->name;
>      const char *addr = mailbox->addr;
> +    const char *string = mailbox->string;
>      sprinter_t *format = opt->format;
>  
>      if (format->is_text_printer) {
>  	char *mailbox_str;
>  
> -	if (name && *name)
> -	    mailbox_str = talloc_asprintf (format, "%s <%s>", name, addr);
> -	else
> -	    mailbox_str = talloc_strdup (format, addr);
> +	mailbox_str = talloc_strdup (format, string);
>  
>  	if (! mailbox_str) {
>  	    fprintf (stderr, "Error: out of memory\n");
> @@ -309,6 +308,7 @@ process_address_list (const search_options_t *opt, GHashTable *addrs,
>  	    mailbox_t mbx = {
>  		.name = internet_address_get_name (address),
>  		.addr = internet_address_mailbox_get_addr (mailbox),
> +		.string = internet_address_to_string (address, TRUE),

I'd prefer having the second parameter (encode) FALSE. This will still
add quotes when necessary, but does not encode non-ascii characters so
the result would be human readable.

Another question is whether to add .string to mailbox_t. In this patch
it doesn't matter, but if --output=count patch will be merged, this
would mean that memory consumption doubles, because with --output=count
the addresses are kept in memory and printed only after the search is
completed. It would be therefore better to construct a new
InternetAddressMailbox from name and addr in print_mailbox() and perform
the conversion to string there. Thoughts?

Thanks,
-Michal

>  	    };
>  
>  	    if (is_duplicate (opt, addrs, mbx.name, mbx.addr))
> diff --git a/test/T090-search-output.sh b/test/T090-search-output.sh
> index 841a721..776e3f4 100755
> --- a/test/T090-search-output.sh
> +++ b/test/T090-search-output.sh
> @@ -390,7 +390,7 @@ test_expect_equal_file OUTPUT EXPECTED
>  test_begin_subtest "--output=sender"
>  notmuch search --output=sender '*' >OUTPUT
>  cat <<EOF >EXPECTED
> -François Boulogne <boulogne.f@gmail.com>
> +=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
>  Olivier Berger <olivier.berger@it-sudparis.eu>
>  Chris Wilson <chris@chris-wilson.co.uk>
>  Carl Worth <cworth@cworth.org>
> @@ -437,7 +437,7 @@ test_begin_subtest "--output=recipients"
>  notmuch search --output=recipients '*' >OUTPUT
>  cat <<EOF >EXPECTED
>  Allan McRae <allan@archlinux.org>
> -Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
> +"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
>  olivier.berger@it-sudparis.eu
>  notmuch@notmuchmail.org
>  notmuch <notmuch@notmuchmail.org>
> @@ -449,9 +449,9 @@ test_expect_equal_file OUTPUT EXPECTED
>  test_begin_subtest "--output=sender --output=recipients"
>  notmuch search --output=sender --output=recipients '*' >OUTPUT
>  cat <<EOF >EXPECTED
> -François Boulogne <boulogne.f@gmail.com>
> +=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
>  Allan McRae <allan@archlinux.org>
> -Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
> +"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
>  Olivier Berger <olivier.berger@it-sudparis.eu>
>  olivier.berger@it-sudparis.eu
>  Chris Wilson <chris@chris-wilson.co.uk>
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 0/7] notmuch search --output=sender/recipients
  2014-10-31 15:03 ` Jesse Rosenthal
@ 2014-10-31 15:43   ` Michal Sojka
  0 siblings, 0 replies; 16+ messages in thread
From: Michal Sojka @ 2014-10-31 15:43 UTC (permalink / raw)
  To: Jesse Rosenthal, notmuch

On Fri, Oct 31 2014, Jesse Rosenthal wrote:
> Dear Michal,
>
> Thanks for all this. The feature looks great! One issue: I'm getting
> corrupt output when using `--output=count` with "fold" filters:
>
> ~~~
> jkr@ladybug [11:01AM] ~ $ notmuch search --output=recipients --output=count "tag:notmuch and date:today.." 
> 2	Jani Nikula <jani@nikula.org>
> 2	Michal Sojka <sojkam1@fel.cvut.cz>
> 1	David Edmondson <dme@dme.org>
> 9	notmuch@notmuchmail.org
>
> jkr@ladybug [11:01AM] ~ $ notmuch search --output=recipients --output=count --filter-by=addrfold "tag:notmuch and date:today.."
> 2	Michal Sojka <0Fam1@fel.cvutH>
> 9	
> 1	David Edmondson <8dme.org>
> 2	Jani Nikula <jani@nikula.org>
>
> jkr@ladybug [11:01AM] ~ $ notmuch search --output=recipients --output=count --filter-by=nameaddr "tag:notmuch and date:today.."
> 2	Jani Nikula <jani@nikula.org>
> 2	Michal Sojka <sojkam1@fel.cvut.cz>
> 1	David Edmondson <dme@dme.org>
> 9	notmuch@notmuchmail.org
>
> jkr@ladybug [11:01AM] ~ $ notmuch search --output=recipients --output=count --filter-by=nameaddrfold "tag:notmuch and date:today.."
> 2	Jani Nikula <jani@nikula.org>
> 2	Michal Sojka <0V�	am1@fel.cvutH>
> 1	David Edmondson <8��	dme.org>
> 9
> ~~~
>
> Not sure if this is expected behavior with the filtering mechanism in
> its current state, but I wanted to call it to your attention if not.

Definitely not. I'll look into that. Thanks.

-Michal

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 0/7] notmuch search --output=sender/recipients
  2014-10-31 15:32     ` Michal Sojka
@ 2014-10-31 16:46       ` Tomi Ollila
  2014-10-31 16:54         ` Mark Walters
  2014-10-31 16:47       ` Mark Walters
  1 sibling, 1 reply; 16+ messages in thread
From: Tomi Ollila @ 2014-10-31 16:46 UTC (permalink / raw)
  To: Michal Sojka, Mark Walters, notmuch

On Fri, Oct 31 2014, Michal Sojka <sojkam1@fel.cvut.cz> wrote:

> On Fri, Oct 31 2014, Mark Walters wrote:
>> My only query is in the text output: should the name part be printed as
>> a quoted string. For example currently I get a line of the form
>>
>> Bloggs, Fred <fred@example.com>
>
> Good point.

There has been some discussion on this issue on IRC channel, and the
opinion of most (seen so far) is that output the parts without quoting
(i.e. just like done in this patch series)...

Taken the other example from Mark's earlyer email:

Bloggs <the king>, Fred <fred@example.com>

echo '^^^' | sed 's/.*</</' would leave only the address part (with <>:s) (*)

echo '^^^' | sed 's/ <[^<]*$//' would leave only the name part

and regexp '\(.*\) <\(.*\)>' or pcre-compatible /(.*)\s<(.*)>/
would capture name & address parts...

(all of the above untested, though;)

In case instead of 'name <addr>' there is only 'addr', then above
sed lines return the same (full) string and the regexps just don't
match.

There were some suggestions how the text output could be changed on IRC;
if anyone wishes to bring those forward, please do so :D

So, IMO, this issue is not showstopper in this series (if anything is);
patches 1-6 looks good to me on paper, but I have not tested those yet.

Tomi

(*) echo '^^^' | sed 's/.*<//; s/>.*//' would drop <>'s from first example


>
> On Fri, Oct 31 2014, Mark Walters wrote:
>> Hi
>>
>> I attach a patch which does the quoting for real names but I don't know
>> if we want it: it changes (example taken from the test suite)
>>
>> François Boulogne to
>>
>> =?iso-8859-1?q?Fran=E7ois?= Boulogne
>>
>> (which is what was in the original email)
>>
>> Plausibly the best thing is just to leave the series as is, so the
>> text output is readable and parseable in the common cases.
>>
>> Anyway the patch is attached if anyone wants to experiment.
>>
>> Best wishes
>>
>> Mark
>>
>> From 53b1ced2d6a9fbbba93448325f795e6b99faa240 Mon Sep 17 00:00:00 2001
>> From: Mark Walters <markwalters1009@gmail.com>
>> Date: Fri, 31 Oct 2014 10:11:40 +0000
>> Subject: [PATCH] search: quote real names for output=sender/recipient in text
>>  format
>>
>> This quotes the real name (when gmime thinks appropriate) for the text
>> output. For the other outputs the real name is separate from the
>> address so the consumer can do any quoting necessary.
>> ---
>>  notmuch-search.c           |    8 ++++----
>>  test/T090-search-output.sh |    8 ++++----
>>  2 files changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/notmuch-search.c b/notmuch-search.c
>> index eae749a..8eac161 100644
>> --- a/notmuch-search.c
>> +++ b/notmuch-search.c
>> @@ -47,6 +47,7 @@ typedef struct {
>>  typedef struct {
>>      const char *name;
>>      const char *addr;
>> +    const char *string;
>>  } mailbox_t;
>>  
>>  /* Return two stable query strings that identify exactly the matched
>> @@ -255,15 +256,13 @@ print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
>>  {
>>      const char *name = mailbox->name;
>>      const char *addr = mailbox->addr;
>> +    const char *string = mailbox->string;
>>      sprinter_t *format = opt->format;
>>  
>>      if (format->is_text_printer) {
>>  	char *mailbox_str;
>>  
>> -	if (name && *name)
>> -	    mailbox_str = talloc_asprintf (format, "%s <%s>", name, addr);
>> -	else
>> -	    mailbox_str = talloc_strdup (format, addr);
>> +	mailbox_str = talloc_strdup (format, string);
>>  
>>  	if (! mailbox_str) {
>>  	    fprintf (stderr, "Error: out of memory\n");
>> @@ -309,6 +308,7 @@ process_address_list (const search_options_t *opt, GHashTable *addrs,
>>  	    mailbox_t mbx = {
>>  		.name = internet_address_get_name (address),
>>  		.addr = internet_address_mailbox_get_addr (mailbox),
>> +		.string = internet_address_to_string (address, TRUE),
>
> I'd prefer having the second parameter (encode) FALSE. This will still
> add quotes when necessary, but does not encode non-ascii characters so
> the result would be human readable.
>
> Another question is whether to add .string to mailbox_t. In this patch
> it doesn't matter, but if --output=count patch will be merged, this
> would mean that memory consumption doubles, because with --output=count
> the addresses are kept in memory and printed only after the search is
> completed. It would be therefore better to construct a new
> InternetAddressMailbox from name and addr in print_mailbox() and perform
> the conversion to string there. Thoughts?
>
> Thanks,
> -Michal
>
>>  	    };
>>  
>>  	    if (is_duplicate (opt, addrs, mbx.name, mbx.addr))
>> diff --git a/test/T090-search-output.sh b/test/T090-search-output.sh
>> index 841a721..776e3f4 100755
>> --- a/test/T090-search-output.sh
>> +++ b/test/T090-search-output.sh
>> @@ -390,7 +390,7 @@ test_expect_equal_file OUTPUT EXPECTED
>>  test_begin_subtest "--output=sender"
>>  notmuch search --output=sender '*' >OUTPUT
>>  cat <<EOF >EXPECTED
>> -François Boulogne <boulogne.f@gmail.com>
>> +=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
>>  Olivier Berger <olivier.berger@it-sudparis.eu>
>>  Chris Wilson <chris@chris-wilson.co.uk>
>>  Carl Worth <cworth@cworth.org>
>> @@ -437,7 +437,7 @@ test_begin_subtest "--output=recipients"
>>  notmuch search --output=recipients '*' >OUTPUT
>>  cat <<EOF >EXPECTED
>>  Allan McRae <allan@archlinux.org>
>> -Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
>> +"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
>>  olivier.berger@it-sudparis.eu
>>  notmuch@notmuchmail.org
>>  notmuch <notmuch@notmuchmail.org>
>> @@ -449,9 +449,9 @@ test_expect_equal_file OUTPUT EXPECTED
>>  test_begin_subtest "--output=sender --output=recipients"
>>  notmuch search --output=sender --output=recipients '*' >OUTPUT
>>  cat <<EOF >EXPECTED
>> -François Boulogne <boulogne.f@gmail.com>
>> +=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
>>  Allan McRae <allan@archlinux.org>
>> -Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
>> +"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
>>  Olivier Berger <olivier.berger@it-sudparis.eu>
>>  olivier.berger@it-sudparis.eu
>>  Chris Wilson <chris@chris-wilson.co.uk>
>> -- 
>> 1.7.10.4
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 0/7] notmuch search --output=sender/recipients
  2014-10-31 15:32     ` Michal Sojka
  2014-10-31 16:46       ` Tomi Ollila
@ 2014-10-31 16:47       ` Mark Walters
  1 sibling, 0 replies; 16+ messages in thread
From: Mark Walters @ 2014-10-31 16:47 UTC (permalink / raw)
  To: Michal Sojka, notmuch


Hi

On Fri, 31 Oct 2014, Michal Sojka <sojkam1@fel.cvut.cz> wrote:
> On Fri, Oct 31 2014, Mark Walters wrote:
>> My only query is in the text output: should the name part be printed as
>> a quoted string. For example currently I get a line of the form
>>
>> Bloggs, Fred <fred@example.com>
>
> Good point.
>
> On Fri, Oct 31 2014, Mark Walters wrote:
>> Hi
>>
>> I attach a patch which does the quoting for real names but I don't know
>> if we want it: it changes (example taken from the test suite)
>>
>> François Boulogne to
>>
>> =?iso-8859-1?q?Fran=E7ois?= Boulogne
>>
>> (which is what was in the original email)
>>
>> Plausibly the best thing is just to leave the series as is, so the
>> text output is readable and parseable in the common cases.
>>
>> Anyway the patch is attached if anyone wants to experiment.
>>
>> Best wishes
>>
>> Mark
>>
>> From 53b1ced2d6a9fbbba93448325f795e6b99faa240 Mon Sep 17 00:00:00 2001
>> From: Mark Walters <markwalters1009@gmail.com>
>> Date: Fri, 31 Oct 2014 10:11:40 +0000
>> Subject: [PATCH] search: quote real names for output=sender/recipient in text
>>  format
>>
>> This quotes the real name (when gmime thinks appropriate) for the text
>> output. For the other outputs the real name is separate from the
>> address so the consumer can do any quoting necessary.
>> ---
>>  notmuch-search.c           |    8 ++++----
>>  test/T090-search-output.sh |    8 ++++----
>>  2 files changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/notmuch-search.c b/notmuch-search.c
>> index eae749a..8eac161 100644
>> --- a/notmuch-search.c
>> +++ b/notmuch-search.c
>> @@ -47,6 +47,7 @@ typedef struct {
>>  typedef struct {
>>      const char *name;
>>      const char *addr;
>> +    const char *string;
>>  } mailbox_t;
>>  
>>  /* Return two stable query strings that identify exactly the matched
>> @@ -255,15 +256,13 @@ print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
>>  {
>>      const char *name = mailbox->name;
>>      const char *addr = mailbox->addr;
>> +    const char *string = mailbox->string;
>>      sprinter_t *format = opt->format;
>>  
>>      if (format->is_text_printer) {
>>  	char *mailbox_str;
>>  
>> -	if (name && *name)
>> -	    mailbox_str = talloc_asprintf (format, "%s <%s>", name, addr);
>> -	else
>> -	    mailbox_str = talloc_strdup (format, addr);
>> +	mailbox_str = talloc_strdup (format, string);
>>  
>>  	if (! mailbox_str) {
>>  	    fprintf (stderr, "Error: out of memory\n");
>> @@ -309,6 +308,7 @@ process_address_list (const search_options_t *opt, GHashTable *addrs,
>>  	    mailbox_t mbx = {
>>  		.name = internet_address_get_name (address),
>>  		.addr = internet_address_mailbox_get_addr (mailbox),
>> +		.string = internet_address_to_string (address, TRUE),
>
> I'd prefer having the second parameter (encode) FALSE. This will still
> add quotes when necessary, but does not encode non-ascii characters so
> the result would be human readable.

Yes that is clearly better.
>
> Another question is whether to add .string to mailbox_t. In this patch
> it doesn't matter, but if --output=count patch will be merged, this
> would mean that memory consumption doubles, because with --output=count
> the addresses are kept in memory and printed only after the search is
> completed. It would be therefore better to construct a new
> InternetAddressMailbox from name and addr in print_mailbox() and perform
> the conversion to string there. Thoughts?

Yes I agree here too. Are you happy doing a replacement patch or should
I?

Best wishes

Mark

>
> Thanks,
> -Michal
>
>>  	    };
>>  
>>  	    if (is_duplicate (opt, addrs, mbx.name, mbx.addr))
>> diff --git a/test/T090-search-output.sh b/test/T090-search-output.sh
>> index 841a721..776e3f4 100755
>> --- a/test/T090-search-output.sh
>> +++ b/test/T090-search-output.sh
>> @@ -390,7 +390,7 @@ test_expect_equal_file OUTPUT EXPECTED
>>  test_begin_subtest "--output=sender"
>>  notmuch search --output=sender '*' >OUTPUT
>>  cat <<EOF >EXPECTED
>> -François Boulogne <boulogne.f@gmail.com>
>> +=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
>>  Olivier Berger <olivier.berger@it-sudparis.eu>
>>  Chris Wilson <chris@chris-wilson.co.uk>
>>  Carl Worth <cworth@cworth.org>
>> @@ -437,7 +437,7 @@ test_begin_subtest "--output=recipients"
>>  notmuch search --output=recipients '*' >OUTPUT
>>  cat <<EOF >EXPECTED
>>  Allan McRae <allan@archlinux.org>
>> -Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
>> +"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
>>  olivier.berger@it-sudparis.eu
>>  notmuch@notmuchmail.org
>>  notmuch <notmuch@notmuchmail.org>
>> @@ -449,9 +449,9 @@ test_expect_equal_file OUTPUT EXPECTED
>>  test_begin_subtest "--output=sender --output=recipients"
>>  notmuch search --output=sender --output=recipients '*' >OUTPUT
>>  cat <<EOF >EXPECTED
>> -François Boulogne <boulogne.f@gmail.com>
>> +=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
>>  Allan McRae <allan@archlinux.org>
>> -Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
>> +"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
>>  Olivier Berger <olivier.berger@it-sudparis.eu>
>>  olivier.berger@it-sudparis.eu
>>  Chris Wilson <chris@chris-wilson.co.uk>
>> -- 
>> 1.7.10.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 0/7] notmuch search --output=sender/recipients
  2014-10-31 16:46       ` Tomi Ollila
@ 2014-10-31 16:54         ` Mark Walters
  0 siblings, 0 replies; 16+ messages in thread
From: Mark Walters @ 2014-10-31 16:54 UTC (permalink / raw)
  To: Tomi Ollila, Michal Sojka, notmuch


On Fri, 31 Oct 2014, Tomi Ollila <tomi.ollila@iki.fi> wrote:
> On Fri, Oct 31 2014, Michal Sojka <sojkam1@fel.cvut.cz> wrote:
>
>> On Fri, Oct 31 2014, Mark Walters wrote:
>>> My only query is in the text output: should the name part be printed as
>>> a quoted string. For example currently I get a line of the form
>>>
>>> Bloggs, Fred <fred@example.com>
>>
>> Good point.
>
> There has been some discussion on this issue on IRC channel, and the
> opinion of most (seen so far) is that output the parts without quoting
> (i.e. just like done in this patch series)...

Just to confirm I am happy either way with the quoting. I think it would
be good not to change the text format after it goes into mainline
though. 

Best wishes

Mark





>
> Taken the other example from Mark's earlyer email:
>
> Bloggs <the king>, Fred <fred@example.com>
>
> echo '^^^' | sed 's/.*</</' would leave only the address part (with <>:s) (*)
>
> echo '^^^' | sed 's/ <[^<]*$//' would leave only the name part
>
> and regexp '\(.*\) <\(.*\)>' or pcre-compatible /(.*)\s<(.*)>/
> would capture name & address parts...
>
> (all of the above untested, though;)
>
> In case instead of 'name <addr>' there is only 'addr', then above
> sed lines return the same (full) string and the regexps just don't
> match.
>
> There were some suggestions how the text output could be changed on IRC;
> if anyone wishes to bring those forward, please do so :D
>
> So, IMO, this issue is not showstopper in this series (if anything is);
> patches 1-6 looks good to me on paper, but I have not tested those yet.
>
> Tomi
>
> (*) echo '^^^' | sed 's/.*<//; s/>.*//' would drop <>'s from first example
>
>
>>
>> On Fri, Oct 31 2014, Mark Walters wrote:
>>> Hi
>>>
>>> I attach a patch which does the quoting for real names but I don't know
>>> if we want it: it changes (example taken from the test suite)
>>>
>>> François Boulogne to
>>>
>>> =?iso-8859-1?q?Fran=E7ois?= Boulogne
>>>
>>> (which is what was in the original email)
>>>
>>> Plausibly the best thing is just to leave the series as is, so the
>>> text output is readable and parseable in the common cases.
>>>
>>> Anyway the patch is attached if anyone wants to experiment.
>>>
>>> Best wishes
>>>
>>> Mark
>>>
>>> From 53b1ced2d6a9fbbba93448325f795e6b99faa240 Mon Sep 17 00:00:00 2001
>>> From: Mark Walters <markwalters1009@gmail.com>
>>> Date: Fri, 31 Oct 2014 10:11:40 +0000
>>> Subject: [PATCH] search: quote real names for output=sender/recipient in text
>>>  format
>>>
>>> This quotes the real name (when gmime thinks appropriate) for the text
>>> output. For the other outputs the real name is separate from the
>>> address so the consumer can do any quoting necessary.
>>> ---
>>>  notmuch-search.c           |    8 ++++----
>>>  test/T090-search-output.sh |    8 ++++----
>>>  2 files changed, 8 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/notmuch-search.c b/notmuch-search.c
>>> index eae749a..8eac161 100644
>>> --- a/notmuch-search.c
>>> +++ b/notmuch-search.c
>>> @@ -47,6 +47,7 @@ typedef struct {
>>>  typedef struct {
>>>      const char *name;
>>>      const char *addr;
>>> +    const char *string;
>>>  } mailbox_t;
>>>  
>>>  /* Return two stable query strings that identify exactly the matched
>>> @@ -255,15 +256,13 @@ print_mailbox (const search_options_t *opt, const mailbox_t *mailbox)
>>>  {
>>>      const char *name = mailbox->name;
>>>      const char *addr = mailbox->addr;
>>> +    const char *string = mailbox->string;
>>>      sprinter_t *format = opt->format;
>>>  
>>>      if (format->is_text_printer) {
>>>  	char *mailbox_str;
>>>  
>>> -	if (name && *name)
>>> -	    mailbox_str = talloc_asprintf (format, "%s <%s>", name, addr);
>>> -	else
>>> -	    mailbox_str = talloc_strdup (format, addr);
>>> +	mailbox_str = talloc_strdup (format, string);
>>>  
>>>  	if (! mailbox_str) {
>>>  	    fprintf (stderr, "Error: out of memory\n");
>>> @@ -309,6 +308,7 @@ process_address_list (const search_options_t *opt, GHashTable *addrs,
>>>  	    mailbox_t mbx = {
>>>  		.name = internet_address_get_name (address),
>>>  		.addr = internet_address_mailbox_get_addr (mailbox),
>>> +		.string = internet_address_to_string (address, TRUE),
>>
>> I'd prefer having the second parameter (encode) FALSE. This will still
>> add quotes when necessary, but does not encode non-ascii characters so
>> the result would be human readable.
>>
>> Another question is whether to add .string to mailbox_t. In this patch
>> it doesn't matter, but if --output=count patch will be merged, this
>> would mean that memory consumption doubles, because with --output=count
>> the addresses are kept in memory and printed only after the search is
>> completed. It would be therefore better to construct a new
>> InternetAddressMailbox from name and addr in print_mailbox() and perform
>> the conversion to string there. Thoughts?
>>
>> Thanks,
>> -Michal
>>
>>>  	    };
>>>  
>>>  	    if (is_duplicate (opt, addrs, mbx.name, mbx.addr))
>>> diff --git a/test/T090-search-output.sh b/test/T090-search-output.sh
>>> index 841a721..776e3f4 100755
>>> --- a/test/T090-search-output.sh
>>> +++ b/test/T090-search-output.sh
>>> @@ -390,7 +390,7 @@ test_expect_equal_file OUTPUT EXPECTED
>>>  test_begin_subtest "--output=sender"
>>>  notmuch search --output=sender '*' >OUTPUT
>>>  cat <<EOF >EXPECTED
>>> -François Boulogne <boulogne.f@gmail.com>
>>> +=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
>>>  Olivier Berger <olivier.berger@it-sudparis.eu>
>>>  Chris Wilson <chris@chris-wilson.co.uk>
>>>  Carl Worth <cworth@cworth.org>
>>> @@ -437,7 +437,7 @@ test_begin_subtest "--output=recipients"
>>>  notmuch search --output=recipients '*' >OUTPUT
>>>  cat <<EOF >EXPECTED
>>>  Allan McRae <allan@archlinux.org>
>>> -Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
>>> +"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
>>>  olivier.berger@it-sudparis.eu
>>>  notmuch@notmuchmail.org
>>>  notmuch <notmuch@notmuchmail.org>
>>> @@ -449,9 +449,9 @@ test_expect_equal_file OUTPUT EXPECTED
>>>  test_begin_subtest "--output=sender --output=recipients"
>>>  notmuch search --output=sender --output=recipients '*' >OUTPUT
>>>  cat <<EOF >EXPECTED
>>> -François Boulogne <boulogne.f@gmail.com>
>>> +=?iso-8859-1?q?Fran=E7ois?= Boulogne <boulogne.f@gmail.com>
>>>  Allan McRae <allan@archlinux.org>
>>> -Discussion about the Arch User Repository (AUR) <aur-general@archlinux.org>
>>> +"Discussion about the Arch User Repository (AUR)" <aur-general@archlinux.org>
>>>  Olivier Berger <olivier.berger@it-sudparis.eu>
>>>  olivier.berger@it-sudparis.eu
>>>  Chris Wilson <chris@chris-wilson.co.uk>
>>> -- 
>>> 1.7.10.4
>> _______________________________________________
>> notmuch mailing list
>> notmuch@notmuchmail.org
>> http://notmuchmail.org/mailman/listinfo/notmuch

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-10-31 16:54 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-30 23:59 [PATCH v5 0/7] notmuch search --output=sender/recipients Michal Sojka
2014-10-30 23:59 ` [PATCH v5 1/7] cli: search: Refactor passing of command line options Michal Sojka
2014-10-30 23:59 ` [PATCH v5 2/7] cli: Add support for parsing keyword-flag arguments Michal Sojka
2014-10-30 23:59 ` [PATCH v5 3/7] cli: search: Convert --output to keyword-flag argument Michal Sojka
2014-10-30 23:59 ` [PATCH v5 4/7] cli: search: Add --output={sender,recipients} Michal Sojka
2014-10-30 23:59 ` [PATCH v5 5/7] cli: search: Do not output duplicate addresses Michal Sojka
2014-10-30 23:59 ` [PATCH v5 6/7] cli: search: Add --output=count Michal Sojka
2014-10-30 23:59 ` [PATCH v5 7/7] cli: search: Add --filter-by option to configure address filtering Michal Sojka
2014-10-31  8:54 ` [PATCH v5 0/7] notmuch search --output=sender/recipients Mark Walters
2014-10-31 10:20   ` Mark Walters
2014-10-31 15:32     ` Michal Sojka
2014-10-31 16:46       ` Tomi Ollila
2014-10-31 16:54         ` Mark Walters
2014-10-31 16:47       ` Mark Walters
2014-10-31 15:03 ` Jesse Rosenthal
2014-10-31 15:43   ` Michal Sojka

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).