unofficial mirror of notmuch@notmuchmail.org
* cli: add --include-html option to notmuch show
@ 2013-07-02  0:19 John Lenz
  2013-07-21 20:23 ` Tomi Ollila
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: John Lenz @ 2013-07-02  0:19 UTC (permalink / raw)
  To: notmuch

For my client, the largest bottleneck when displaying large threads is
exporting each html part individually, since by default notmuch does not
include html parts in the json output.  Large threads can contain quite a
few parts, and each must be exported and decoded one by one.  I then also
have to deal with all the crazy charsets, which I can do through a library
but which is a pain.

Therefore, this patch adds an --include-html option that causes the
text/html parts to be included as part of the output of show.
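
The change to the content-inclusion test can be sketched in isolation.
The helper below is hypothetical and not part of notmuch; it only mirrors
the boolean condition that the patch changes in format_part_sprinter, with
the GMime content-type checks replaced by plain flags (an assumption made
for illustration).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the condition in format_part_sprinter.
 *   is_text      ~ the content type matches text/<anything>
 *   is_html      ~ the content type matches text/html
 *   include_html ~ --include-html was given
 * Returns true when the decoded content of the part goes into the output. */
static bool
part_content_included (bool is_text, bool is_html, bool include_html)
{
    return is_text && (include_html || ! is_html);
}
```

With include_html false this reduces to the old behaviour (text parts other
than text/html only); with include_html true every text part is included.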


diff man/man1/notmuch-show.1
--- a/man/man1/notmuch-show.1	Sun Jun 23 14:24:02 2013 +1000
+++ b/man/man1/notmuch-show.1	Mon Jul 01 18:51:13 2013 -0500
@@ -207,6 +207,20 @@
 output is much faster and substantially smaller.
 .RE
 
+.RS 4
+.TP 4
+.B \-\-include-html
+
+Include "text/html" parts as part of the output (currently only supported with
+--format=json and --format=sexp).
+By default, unless
+.B --part=N
+is used to select a specific part or
+.B --include-html
+is used to include all "text/html" parts, no part with content type "text/html"
+is included in the output.
+.RE
+
 A common use of
 .B notmuch show
 is to display a single thread of email messages. For this, use a
diff notmuch-client.h
--- a/notmuch-client.h	Sun Jun 23 14:24:02 2013 +1000
+++ b/notmuch-client.h	Mon Jul 01 18:51:13 2013 -0500
@@ -89,6 +89,7 @@
     notmuch_bool_t raw;
     int part;
     notmuch_crypto_t crypto;
+    notmuch_bool_t include_html;
 } notmuch_show_params_t;
 
 /* There's no point in continuing when we've detected that we've done
@@ -220,7 +221,8 @@
 
 void
 format_part_sprinter (const void *ctx, struct sprinter *sp, mime_node_t *node,
-		      notmuch_bool_t first, notmuch_bool_t output_body);
+		      notmuch_bool_t first, notmuch_bool_t output_body,
+		      notmuch_bool_t include_html);
 
 void
 format_headers_sprinter (struct sprinter *sp, GMimeMessage *message,
diff notmuch-reply.c
--- a/notmuch-reply.c	Sun Jun 23 14:24:02 2013 +1000
+++ b/notmuch-reply.c	Mon Jul 01 18:51:13 2013 -0500
@@ -624,7 +624,7 @@
 
     /* Start the original */
     sp->map_key (sp, "original");
-    format_part_sprinter (ctx, sp, node, TRUE, TRUE);
+    format_part_sprinter (ctx, sp, node, TRUE, TRUE, FALSE);
 
     /* End */
     sp->end (sp);
diff notmuch-show.c
--- a/notmuch-show.c	Sun Jun 23 14:24:02 2013 +1000
+++ b/notmuch-show.c	Mon Jul 01 18:51:13 2013 -0500
@@ -630,7 +630,8 @@
 
 void
 format_part_sprinter (const void *ctx, sprinter_t *sp, mime_node_t *node,
-		      notmuch_bool_t first, notmuch_bool_t output_body)
+		      notmuch_bool_t first, notmuch_bool_t output_body,
+		      notmuch_bool_t include_html)
 {
     /* Any changes to the JSON or S-Expression format should be
      * reflected in the file devel/schemata. */
@@ -645,7 +646,7 @@
 	if (output_body) {
 	    sp->map_key (sp, "body");
 	    sp->begin_list (sp);
-	    format_part_sprinter (ctx, sp, mime_node_child (node, 0), first, TRUE);
+	    format_part_sprinter (ctx, sp, mime_node_child (node, 0), first, TRUE, include_html);
 	    sp->end (sp);
 	}
 	sp->end (sp);
@@ -700,14 +701,15 @@
 	/* For non-HTML text parts, we include the content in the
 	 * JSON. Since JSON must be Unicode, we handle charset
 	 * decoding here and do not report a charset to the caller.
-	 * For text/html parts, we do not include the content. If a
-	 * caller is interested in text/html parts, it should retrieve
-	 * them separately and they will not be decoded. Since this
-	 * makes charset decoding the responsibility on the caller, we
+	 * For text/html parts, we do not include the content unless
+	 * the --include-html option has been passed. If a html part
+	 * is not included, it can be requested directly. This makes
+	 * charset decoding the responsibility on the caller so we
 	 * report the charset for text/html parts.
 	 */
 	if (g_mime_content_type_is_type (content_type, "text", "*") &&
-	    ! g_mime_content_type_is_type (content_type, "text", "html"))
+	    (include_html ||
+	     ! g_mime_content_type_is_type (content_type, "text", "html")))
 	{
 	    GMimeStream *stream_memory = g_mime_stream_mem_new ();
 	    GByteArray *part_content;
@@ -737,7 +739,7 @@
     }
 
     for (i = 0; i < node->nchildren; i++)
-	format_part_sprinter (ctx, sp, mime_node_child (node, i), i == 0, TRUE);
+	format_part_sprinter (ctx, sp, mime_node_child (node, i), i == 0, TRUE, include_html);
 
     /* Close content structures */
     for (i = 0; i < nclose; i++)
@@ -751,7 +753,7 @@
 			    mime_node_t *node, unused (int indent),
 			    const notmuch_show_params_t *params)
 {
-    format_part_sprinter (ctx, sp, node, TRUE, params->output_body);
+    format_part_sprinter (ctx, sp, node, TRUE, params->output_body, params->include_html);
 
     return NOTMUCH_STATUS_SUCCESS;
 }
@@ -1077,7 +1079,8 @@
 	.crypto = {
 	    .verify = FALSE,
 	    .decrypt = FALSE
-	}
+	},
+	.include_html = FALSE
     };
     int format_sel = NOTMUCH_FORMAT_NOT_SPECIFIED;
     int exclude = EXCLUDE_TRUE;
@@ -1105,6 +1108,7 @@
 	{ NOTMUCH_OPT_BOOLEAN, &params.crypto.decrypt, "decrypt", 'd', 0 },
 	{ NOTMUCH_OPT_BOOLEAN, &params.crypto.verify, "verify", 'v', 0 },
 	{ NOTMUCH_OPT_BOOLEAN, &params.output_body, "body", 'b', 0 },
+	{ NOTMUCH_OPT_BOOLEAN, &params.include_html, "include-html", 0, 0 },
 	{ 0, 0, 0, 0, 0 }
     };
 
@@ -1176,6 +1180,11 @@
 	}
     }
 
+    if (params.include_html &&
+        (format_sel != NOTMUCH_FORMAT_JSON && format_sel != NOTMUCH_FORMAT_SEXP)) {
+	fprintf (stderr, "Warning: --include-html only implemented for format=json and format=sexp\n");
+    }
+
     if (entire_thread == ENTIRE_THREAD_TRUE)
 	params.entire_thread = TRUE;
     else

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: cli: add --include-html option to notmuch show
  2013-07-02  0:19 cli: add --include-html option to notmuch show John Lenz
@ 2013-07-21 20:23 ` Tomi Ollila
  2013-07-22 16:49   ` John Lenz
  2013-07-25  2:36   ` John Lenz
  2013-08-24 15:29 ` [PATCH 1/1] test: test notmuch show --include-html option Tomi Ollila
  2013-08-27 11:03 ` cli: add --include-html option to notmuch show David Bremner
  2 siblings, 2 replies; 14+ messages in thread
From: Tomi Ollila @ 2013-07-21 20:23 UTC (permalink / raw)
  To: John Lenz, notmuch

On Tue, Jul 02 2013, John Lenz <lenz@math.uic.edu> wrote:

> For my client, the largest bottleneck for displaying large threads is
> exporting each html part individually since by default notmuch will not
> show the json parts.  For large threads there can be quite a few parts and
> each must be exported and decoded one by one.  Also, I then have to deal
> with all the crazy charsets which I can do through a library but is a
> pain.

This looks like a useful option. I just wonder what effect different
charsets have on the output (is text/html content output verbatim, with just
json/sexp escaping of '"' characters?).

If you added test(s) showing what happens with different charsets
(like one message having 3 text/html parts, one us-ascii, one iso-8859-1
and one utf-8) that would make things clearer and (also) protect us from 
regressions.

Tomi


> Therefore, this patch adds an --include-html option that causes the
> text/html parts to be included as part of the output of show.
>
>
> diff man/man1/notmuch-show.1
> --- a/man/man1/notmuch-show.1	Sun Jun 23 14:24:02 2013 +1000
> +++ b/man/man1/notmuch-show.1	Mon Jul 01 18:51:13 2013 -0500
> @@ -207,6 +207,20 @@
>  output is much faster and substantially smaller.
>  .RE
>  
> +.RS 4
> +.TP 4
> +.B \-\-include-html
> +
> +Include "text/html" parts as part of the output (currently only supported with
> +--format=json and --format=sexp).
> +By default, unless
> +.B --part=N
> +is used to select a specific part or
> +.B --include-html
> +is used to include all "text/html" parts, no part with content type "text/html"
> +is included in the output.
> +.RE
> +
>  A common use of
>  .B notmuch show
>  is to display a single thread of email messages. For this, use a
> diff notmuch-client.h
> --- a/notmuch-client.h	Sun Jun 23 14:24:02 2013 +1000
> +++ b/notmuch-client.h	Mon Jul 01 18:51:13 2013 -0500
> @@ -89,6 +89,7 @@
>      notmuch_bool_t raw;
>      int part;
>      notmuch_crypto_t crypto;
> +    notmuch_bool_t include_html;
>  } notmuch_show_params_t;
>  
>  /* There's no point in continuing when we've detected that we've done
> @@ -220,7 +221,8 @@
>  
>  void
>  format_part_sprinter (const void *ctx, struct sprinter *sp, mime_node_t *node,
> -		      notmuch_bool_t first, notmuch_bool_t output_body);
> +		      notmuch_bool_t first, notmuch_bool_t output_body,
> +		      notmuch_bool_t include_html);
>  
>  void
>  format_headers_sprinter (struct sprinter *sp, GMimeMessage *message,
> diff notmuch-reply.c
> --- a/notmuch-reply.c	Sun Jun 23 14:24:02 2013 +1000
> +++ b/notmuch-reply.c	Mon Jul 01 18:51:13 2013 -0500
> @@ -624,7 +624,7 @@
>  
>      /* Start the original */
>      sp->map_key (sp, "original");
> -    format_part_sprinter (ctx, sp, node, TRUE, TRUE);
> +    format_part_sprinter (ctx, sp, node, TRUE, TRUE, FALSE);
>  
>      /* End */
>      sp->end (sp);
> diff notmuch-show.c
> --- a/notmuch-show.c	Sun Jun 23 14:24:02 2013 +1000
> +++ b/notmuch-show.c	Mon Jul 01 18:51:13 2013 -0500
> @@ -630,7 +630,8 @@
>  
>  void
>  format_part_sprinter (const void *ctx, sprinter_t *sp, mime_node_t *node,
> -		      notmuch_bool_t first, notmuch_bool_t output_body)
> +		      notmuch_bool_t first, notmuch_bool_t output_body,
> +		      notmuch_bool_t include_html)
>  {
>      /* Any changes to the JSON or S-Expression format should be
>       * reflected in the file devel/schemata. */
> @@ -645,7 +646,7 @@
>  	if (output_body) {
>  	    sp->map_key (sp, "body");
>  	    sp->begin_list (sp);
> -	    format_part_sprinter (ctx, sp, mime_node_child (node, 0), first, TRUE);
> +	    format_part_sprinter (ctx, sp, mime_node_child (node, 0), first, TRUE, include_html);
>  	    sp->end (sp);
>  	}
>  	sp->end (sp);
> @@ -700,14 +701,15 @@
>  	/* For non-HTML text parts, we include the content in the
>  	 * JSON. Since JSON must be Unicode, we handle charset
>  	 * decoding here and do not report a charset to the caller.
> -	 * For text/html parts, we do not include the content. If a
> -	 * caller is interested in text/html parts, it should retrieve
> -	 * them separately and they will not be decoded. Since this
> -	 * makes charset decoding the responsibility on the caller, we
> +	 * For text/html parts, we do not include the content unless
> +	 * the --include-html option has been passed. If a html part
> +	 * is not included, it can be requested directly. This makes
> +	 * charset decoding the responsibility on the caller so we
>  	 * report the charset for text/html parts.
>  	 */
>  	if (g_mime_content_type_is_type (content_type, "text", "*") &&
> -	    ! g_mime_content_type_is_type (content_type, "text", "html"))
> +	    (include_html ||
> +	     ! g_mime_content_type_is_type (content_type, "text", "html")))
>  	{
>  	    GMimeStream *stream_memory = g_mime_stream_mem_new ();
>  	    GByteArray *part_content;
> @@ -737,7 +739,7 @@
>      }
>  
>      for (i = 0; i < node->nchildren; i++)
> -	format_part_sprinter (ctx, sp, mime_node_child (node, i), i == 0, TRUE);
> +	format_part_sprinter (ctx, sp, mime_node_child (node, i), i == 0, TRUE, include_html);
>  
>      /* Close content structures */
>      for (i = 0; i < nclose; i++)
> @@ -751,7 +753,7 @@
>  			    mime_node_t *node, unused (int indent),
>  			    const notmuch_show_params_t *params)
>  {
> -    format_part_sprinter (ctx, sp, node, TRUE, params->output_body);
> +    format_part_sprinter (ctx, sp, node, TRUE, params->output_body, params->include_html);
>  
>      return NOTMUCH_STATUS_SUCCESS;
>  }
> @@ -1077,7 +1079,8 @@
>  	.crypto = {
>  	    .verify = FALSE,
>  	    .decrypt = FALSE
> -	}
> +	},
> +	.include_html = FALSE
>      };
>      int format_sel = NOTMUCH_FORMAT_NOT_SPECIFIED;
>      int exclude = EXCLUDE_TRUE;
> @@ -1105,6 +1108,7 @@
>  	{ NOTMUCH_OPT_BOOLEAN, &params.crypto.decrypt, "decrypt", 'd', 0 },
>  	{ NOTMUCH_OPT_BOOLEAN, &params.crypto.verify, "verify", 'v', 0 },
>  	{ NOTMUCH_OPT_BOOLEAN, &params.output_body, "body", 'b', 0 },
> +	{ NOTMUCH_OPT_BOOLEAN, &params.include_html, "include-html", 0, 0 },
>  	{ 0, 0, 0, 0, 0 }
>      };
>  
> @@ -1176,6 +1180,11 @@
>  	}
>      }
>  
> +    if (params.include_html &&
> +        (format_sel != NOTMUCH_FORMAT_JSON && format_sel != NOTMUCH_FORMAT_SEXP)) {
> +	fprintf (stderr, "Warning: --include-html only implemented for format=json and format=sexp\n");
> +    }
> +
>      if (entire_thread == ENTIRE_THREAD_TRUE)
>  	params.entire_thread = TRUE;
>      else


* Re: cli: add --include-html option to notmuch show
  2013-07-21 20:23 ` Tomi Ollila
@ 2013-07-22 16:49   ` John Lenz
  2013-07-25  2:36   ` John Lenz
  1 sibling, 0 replies; 14+ messages in thread
From: John Lenz @ 2013-07-22 16:49 UTC (permalink / raw)
  To: Tomi Ollila, notmuch

On Sun Jul 21 15:23 -0500 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
> On Tue, Jul 02 2013, John Lenz <lenz@math.uic.edu> wrote:
>
> > For my client, the largest bottleneck for displaying large threads is
> > exporting each html part individually since by default notmuch will not
> > show the json parts.  For large threads there can be quite a few parts and
> > each must be exported and decoded one by one.  Also, I then have to deal
> > with all the crazy charsets which I can do through a library but is a
> > pain.
>
> This looks like a useful option. I just wonder what effect does different
> charsets do to the output (is text/html content output verbatim (with just
> json/sexp escaping of '"' -characters).
>
> If you added test(s) showing what happens with different charsets
> (like one message having 3 text/html parts, one us-ascii, one iso-8859-1
> and one utf-8) that would make things clearer and (also) protect us from 
> regressions.
>

Ok, I'll add some tests, but everything is converted to UTF-8 by gmime.  If
you look, I didn't actually add any extra code.  I just changed which branch
of the if is taken, depending on the option and the content type.  The
existing code already converted everything to UTF-8.
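
As an illustration of what such a conversion does for one charset, here is
a minimal sketch of ISO-8859-1 to UTF-8 conversion. It is not how notmuch
or gmime implement it (gmime uses its own charset conversion machinery);
the function name and buffer contract are assumptions made for the example.

```c
#include <stddef.h>

/* Convert len bytes of ISO-8859-1 text in src to UTF-8 in dst.
 * Illustrative only; the caller must provide room for 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the terminating NUL. */
static size_t
latin1_to_utf8 (const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
	unsigned char c = src[i];

	if (c < 0x80) {
	    /* ASCII range: identical in both encodings. */
	    dst[out++] = c;
	} else {
	    /* U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx. */
	    dst[out++] = 0xC0 | (c >> 6);
	    dst[out++] = 0x80 | (c & 0x3F);
	}
    }
    dst[out] = '\0';
    return out;
}
```

For instance, the single ISO-8859-1 byte 0xBD (the fraction one half,
U+00BD) becomes the two UTF-8 bytes 0xC2 0xBD.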


* Re: cli: add --include-html option to notmuch show
  2013-07-21 20:23 ` Tomi Ollila
  2013-07-22 16:49   ` John Lenz
@ 2013-07-25  2:36   ` John Lenz
  2013-08-04 19:47     ` Tomi Ollila
  1 sibling, 1 reply; 14+ messages in thread
From: John Lenz @ 2013-07-25  2:36 UTC (permalink / raw)
  To: Tomi Ollila, notmuch

On Sun Jul 21 15:23 -0500 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
> On Tue, Jul 02 2013, John Lenz <lenz@math.uic.edu> wrote:
> 
> > For my client, the largest bottleneck for displaying large threads is
> > exporting each html part individually since by default notmuch will not
> > show the json parts.  For large threads there can be quite a few parts and
> > each must be exported and decoded one by one.  Also, I then have to deal
> > with all the crazy charsets which I can do through a library but is a
> > pain.
> 
> This looks like a useful option. I just wonder what effect does different
> charsets do to the output (is text/html content output verbatim (with just
> json/sexp escaping of '"' -characters). 
> 
> If you added test(s) showing what happens with different charsets
> (like one message having 3 text/html parts, one us-ascii, one iso-8859-1
> and one utf-8) that would make things clearer and (also) protect us from 
> regressions.
> 


Here is a test I wrote.  I tried to follow the formatting of the other tests.
Let me know if you want this as a single patch combined with the code
that enables the option; I can resend it.



#!/usr/bin/env bash
test_description="include html parts when showing message"
. ./test-lib.sh

cat <<EOF > ${MAIL_DIR}/msg
From: A <a@example.com>
To: B <b@example.com>
Subject: html message
Date: Sat, 01 January 2000 00:00:00 +0000
Message-ID: <htmlmessage>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="==-=="

--==-==
Content-Type: text/html; charset=UTF-8

EOF
# The Unicode fraction symbol 1/2 is U+00BD and is encoded
# in UTF-8 as two bytes: octal 302 275
echo $'<p>0.5 equals \302\275</p>' >> ${MAIL_DIR}/msg
cat <<EOF >> ${MAIL_DIR}/msg

--==-==
Content-Type: text/html; charset=ISO-8859-1

EOF
# The ISO-8859-1 encoding of U+00BD is a single byte: octal 275
echo $'<p>0.5 equals \275</p>' >> ${MAIL_DIR}/msg
cat <<EOF >> ${MAIL_DIR}/msg

--==-==
Content-Type: text/plain; charset=UTF-8

0.5 equals 1/2

--==-==--
EOF

notmuch new > /dev/null


cat <<EOF > EXPECTED.head
[[[{"id": "htmlmessage", "match":true, "excluded": false, "date_relative":"2000-01-01",
   "timestamp": 946684800,
   "filename": "${MAIL_DIR}/msg",
   "tags": ["inbox", "unread"],
   "headers": { "Date": "Sat, 01 Jan 2000 00:00:00 +0000", "From": "A <a@example.com>",
                "Subject": "html message", "To": "B <b@example.com>"},
   "body": [{
     "content-type": "multipart/alternative", "id": 1,
EOF

cat EXPECTED.head > EXPECTED.nohtml
cat <<EOF >> EXPECTED.nohtml
"content": [
  { "id": 2, "content-charset": "UTF-8", "content-length": 21, "content-type": "text/html"},
  { "id": 3, "content-charset": "ISO-8859-1", "content-length": 20, "content-type": "text/html"},
  { "id": 4, "content-type": "text/plain", "content": "0.5 equals 1/2\\n"}
]}]},[]]]]
EOF

# Both the UTF-8 and ISO-8859-1 part should have U+00BD
cat EXPECTED.head > EXPECTED.withhtml
cat <<EOF >> EXPECTED.withhtml
"content": [
  { "id": 2, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
  { "id": 3, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
  { "id": 4, "content-type": "text/plain", "content": "0.5 equals 1/2\\n"}
]}]},[]]]]
EOF

test_begin_subtest "html parts excluded by default"
notmuch show --format=json id:htmlmessage >OUTPUT
test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.nohtml)"

test_begin_subtest "html parts included"
notmuch show --format=json --include-html id:htmlmessage > OUTPUT
test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.withhtml)"

test_done


* Re: cli: add --include-html option to notmuch show
  2013-07-25  2:36   ` John Lenz
@ 2013-08-04 19:47     ` Tomi Ollila
  2013-08-17 15:52       ` John Lenz
  0 siblings, 1 reply; 14+ messages in thread
From: Tomi Ollila @ 2013-08-04 19:47 UTC (permalink / raw)
  To: John Lenz, notmuch

On Thu, Jul 25 2013, John Lenz <lenz@math.uic.edu> wrote:

> On Sun Jul 21 15:23 -0500 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
>> On Tue, Jul 02 2013, John Lenz <lenz@math.uic.edu> wrote:
>> 
>> > For my client, the largest bottleneck for displaying large threads is
>> > exporting each html part individually since by default notmuch will not
>> > show the json parts.  For large threads there can be quite a few parts and
>> > each must be exported and decoded one by one.  Also, I then have to deal
>> > with all the crazy charsets which I can do through a library but is a
>> > pain.
>> 
>> This looks like a useful option. I just wonder what effect does different
>> charsets do to the output (is text/html content output verbatim (with just
>> json/sexp escaping of '"' -characters). 
>> 
>> If you added test(s) showing what happens with different charsets
>> (like one message having 3 text/html parts, one us-ascii, one iso-8859-1
>> and one utf-8) that would make things clearer and (also) protect us from 
>> regressions.
>> 

> Here is a test I wrote.  I tried to follow the other tests in formatting.
> Let me know if you want this as a single patch combined with the code
> to enable the option, I can resend it.

I took your patch, modified it a bit and put it at the end of the 'multipart'
test. The diff is attached at the end for viewing.

The next question is whether the new option should be

--include-html

or as

--include-html=(true|false)

or even

--body=(true|false|text-and-html)

See --exclude option in http://notmuchmail.org/manpages/notmuch-search-1/
and --body option in http://notmuchmail.org/manpages/notmuch-show-1/
for comparison...


Tomi

--8<----8<----8<----8<----8<--

diff --git a/test/multipart b/test/multipart
index c974226..11f10bd 100755
--- a/test/multipart
+++ b/test/multipart
@@ -647,4 +647,84 @@ notmuch show --format=raw --part=3 id:base64-part-with-crlf > crlf.out
 echo -n -e "\xEF\x0D\x0A" > crlf.expected
 test_expect_equal_file crlf.out crlf.expected
 
-test_done
\ No newline at end of file
+
+# The ISO-8859-1 encoding of U+00BD is a single byte: octal 275
+# (Portability note: Dollar-Single ($'...', ANSI C-style escape sequences)
+# quoting works on bash, ksh, zsh, *BSD sh but not on dash, ash nor busybox sh)
+readonly u_00bd_latin1=$'\275'
+
+# The Unicode fraction symbol 1/2 is U+00BD and is encoded
+# in UTF-8 as two bytes: octal 302 275
+readonly u_00bd_utf8=$'\302\275'
+
+cat <<EOF > ${MAIL_DIR}/include-html
+From: A <a@example.com>
+To: B <b@example.com>
+Subject: html message
+Date: Sat, 01 January 2000 00:00:00 +0000
+Message-ID: <htmlmessage>
+MIME-Version: 1.0
+Content-Type: multipart/alternative; boundary="==-=="
+
+--==-==
+Content-Type: text/html; charset=UTF-8
+
+<p>0.5 equals ${u_00bd_utf8}</p>
+
+--==-==
+Content-Type: text/html; charset=ISO-8859-1
+
+<p>0.5 equals ${u_00bd_latin1}</p>
+
+--==-==
+Content-Type: text/plain; charset=UTF-8
+
+0.5 equals ${u_00bd_utf8}
+
+--==-==--
+EOF
+
+notmuch new > /dev/null
+
+cat_expected_head ()
+{
+        cat <<EOF
+[[[{"id": "htmlmessage", "match":true, "excluded": false, "date_relative":"2000-01-01",
+   "timestamp": 946684800,
+   "filename": "${MAIL_DIR}/include-html",
+   "tags": ["inbox", "unread"],
+   "headers": { "Date": "Sat, 01 Jan 2000 00:00:00 +0000", "From": "A <a@example.com>",
+                "Subject": "html message", "To": "B <b@example.com>"},
+   "body": [{
+     "content-type": "multipart/alternative", "id": 1,
+EOF
+}
+
+cat_expected_head > EXPECTED.nohtml
+cat <<EOF >> EXPECTED.nohtml
+"content": [
+  { "id": 2, "content-charset": "UTF-8", "content-length": 21, "content-type": "text/html"},
+  { "id": 3, "content-charset": "ISO-8859-1", "content-length": 20, "content-type": "text/html"},
+  { "id": 4, "content-type": "text/plain", "content": "0.5 equals \\u00bd\\n"}
+]}]},[]]]]
+EOF
+
+# Both the UTF-8 and ISO-8859-1 part should have U+00BD
+cat_expected_head > EXPECTED.withhtml
+cat <<EOF >> EXPECTED.withhtml
+"content": [
+  { "id": 2, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
+  { "id": 3, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
+  { "id": 4, "content-type": "text/plain", "content": "0.5 equals \\u00bd\\n"}
+]}]},[]]]]
+EOF
+
+test_begin_subtest "html parts excluded by default"
+notmuch show --format=json id:htmlmessage > OUTPUT
+test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.nohtml)"
+
+test_begin_subtest "html parts included"
+notmuch show --format=json --include-html id:htmlmessage > OUTPUT
+test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.withhtml)"
+
+test_done


* Re: cli: add --include-html option to notmuch show
  2013-08-04 19:47     ` Tomi Ollila
@ 2013-08-17 15:52       ` John Lenz
  2013-08-18 11:25         ` Jani Nikula
  0 siblings, 1 reply; 14+ messages in thread
From: John Lenz @ 2013-08-17 15:52 UTC (permalink / raw)
  To: Tomi Ollila, notmuch

On Sun Aug  4 14:47 -0500 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
> The next question is should we have new option as
>
> --include-html
>
> or as
>
> --include-html=(true|false)
>
> or even
>
> --body=(true|false|text-and-html)
>
> See --exclude option in http://notmuchmail.org/manpages/notmuch-search-1/
> and --body option in http://notmuchmail.org/manpages/notmuch-show-1/
> for comparison...
>

I have no strong preference here, although I guess I would vote for
--include-html=(true|false), since adding it to --body makes the --body
options confusing: to make sense, the body options would have to be
--body=(text|text-and-html|none), but of course you can't change that and
break the command line API.  Well, maybe you could add all three
--body=(text|text-and-html|none) and still accept true/false for
compatibility.


* Re: cli: add --include-html option to notmuch show
  2013-08-17 15:52       ` John Lenz
@ 2013-08-18 11:25         ` Jani Nikula
  2013-08-18 18:30           ` Tomi Ollila
  0 siblings, 1 reply; 14+ messages in thread
From: Jani Nikula @ 2013-08-18 11:25 UTC (permalink / raw)
  To: John Lenz, Tomi Ollila, notmuch

On Sat, 17 Aug 2013, John Lenz <lenz@math.uic.edu> wrote:
> On Sun Aug  4 14:47 -0500 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
>> The next question is should we have new option as
>> 
>> --include-html
>> 
>> or as
>> 
>> --include-html=(true|false)
>> 
>> or even
>> 
>> --body=(true|false|text-and-html)
>> 
>> See --exclude option in http://notmuchmail.org/manpages/notmuch-search-1/
>> and --body option in http://notmuchmail.org/manpages/notmuch-show-1/
>> for comparison...
>> 
>
> I have no preference here, although I guess I would vote for
> --include-html=(true|false) since adding it to --body makes the --body
> options confusing: to make sense the body options should be
> --body=(text|text-and-html|none) but of course you can't change that and
> break the command line API.  Well, maybe you could add all three
> --body=(text|text-and-html|none)  and still accept true/false for
> compatibility.

Hi John & Tomi -

We could trivially amend the argument parser to |= the keyword values
(instead of =) to allow specifying keyword arguments multiple times on
the command line. With the keyword values specified as bit flags, we
could then have 'notmuch show --body=text --body=html ...' return both
text and html, and allow trivial future extension too.
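
A minimal standalone sketch of that accumulation, assuming hypothetical
flag names (BODY_TEXT, BODY_HTML) and a simplified keyword table rather
than the real notmuch_opt_desc_t structures:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical bit flags for the parts --body could select. */
enum { BODY_TEXT = 1 << 0, BODY_HTML = 1 << 1 };

typedef struct {
    const char *name;
    int value;
} keyword_t;

/* Keyword table, terminated by a NULL name. */
static const keyword_t body_keywords[] = {
    { "text", BODY_TEXT },
    { "html", BODY_HTML },
    { NULL, 0 }
};

/* OR the value of the keyword matching arg into *flags.
 * Returns false (leaving *flags untouched) for an unknown keyword. */
static bool
accumulate_keyword (const keyword_t *keywords, const char *arg, int *flags)
{
    for (; keywords->name; keywords++) {
	if (strcmp (arg, keywords->name) == 0) {
	    *flags |= keywords->value;
	    return true;
	}
    }
    return false;
}
```

Calling accumulate_keyword once for "text" and once for "html" on the same
flags variable leaves both bits set, which is what repeating --body on the
command line would rely on.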

--body=true and --body=false could be handled specially for backwards
compatibility, for example by forcing text only or no parts,
respectively. Since the default is currently text, --body=none might be
a suitable synonym for --body=false, while the boolean alternatives
could be deprecated.
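
One way the backwards-compatible handling could look, as a sketch only:
the names PART_TEXT, PART_HTML and apply_body_keyword are hypothetical,
and treating "none" exactly like "false" is an assumption based on the
suggestion above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical bit flags for selectable body parts. */
enum { PART_TEXT = 1 << 0, PART_HTML = 1 << 1 };

/* Apply one --body=<arg> value to *flags.
 * New-style keywords accumulate; the legacy booleans force a fixed
 * selection. Returns false for an unrecognised value. */
static bool
apply_body_keyword (const char *arg, int *flags)
{
    if (strcmp (arg, "text") == 0)
	*flags |= PART_TEXT;
    else if (strcmp (arg, "html") == 0)
	*flags |= PART_HTML;
    else if (strcmp (arg, "true") == 0)
	*flags = PART_TEXT;	/* legacy: text only */
    else if (strcmp (arg, "false") == 0 || strcmp (arg, "none") == 0)
	*flags = 0;		/* legacy false, or its synonym: no parts */
    else
	return false;
    return true;
}
```

The legacy values assign rather than OR, so a later --body=true or
--body=false overrides whatever was accumulated before it.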

A little less trivially, it's also possible to support e.g.
comma-separated keyword values, such as --body=text,html, but I do prefer
the (implementation) simplicity of the above.

Untested patch to the argument parser below.

Cheers,
Jani.


diff --git a/command-line-arguments.c b/command-line-arguments.c
index bf9aeca..c426054 100644
--- a/command-line-arguments.c
+++ b/command-line-arguments.c
@@ -23,7 +23,10 @@ _process_keyword_arg (const notmuch_opt_desc_t *arg_desc, char next, const char
     while (keywords->name) {
 	if (strcmp (arg_str, keywords->name) == 0) {
 	    if (arg_desc->output_var) {
-		*((int *)arg_desc->output_var) = keywords->value;
+		if (arg_desc->opt_type == NOTMUCH_OPT_KEYWORD_FLAGS)
+		    *((int *)arg_desc->output_var) |= keywords->value;
+		else
+		    *((int *)arg_desc->output_var) = keywords->value;
 	    }
 	    return TRUE;
 	}
@@ -146,6 +149,7 @@ parse_option (const char *arg,
 
 	    switch (try->opt_type) {
 	    case NOTMUCH_OPT_KEYWORD:
+	    case NOTMUCH_OPT_KEYWORD_FLAGS:
 		return _process_keyword_arg (try, next, value);
 		break;
 	    case NOTMUCH_OPT_BOOLEAN:
diff --git a/command-line-arguments.h b/command-line-arguments.h
index de1734a..085a492 100644
--- a/command-line-arguments.h
+++ b/command-line-arguments.h
@@ -8,6 +8,7 @@ enum notmuch_opt_type {
     NOTMUCH_OPT_BOOLEAN,	/* --verbose              */
     NOTMUCH_OPT_INT,		/* --frob=8               */
     NOTMUCH_OPT_KEYWORD,	/* --format=raw|json|text */
+    NOTMUCH_OPT_KEYWORD_FLAGS,  /* the above with values OR'd together */
     NOTMUCH_OPT_STRING,		/* --file=/tmp/gnarf.txt  */
     NOTMUCH_OPT_POSITION	/* notmuch dump pos_arg   */
 };


* Re: cli: add --include-html option to notmuch show
  2013-08-18 11:25         ` Jani Nikula
@ 2013-08-18 18:30           ` Tomi Ollila
  2013-08-24  8:11             ` Mark Walters
  2013-08-24 10:59             ` Jani Nikula
  0 siblings, 2 replies; 14+ messages in thread
From: Tomi Ollila @ 2013-08-18 18:30 UTC (permalink / raw)
  To: Jani Nikula, John Lenz, notmuch

On Sun, Aug 18 2013, Jani Nikula <jani@nikula.org> wrote:

> On Sat, 17 Aug 2013, John Lenz <lenz@math.uic.edu> wrote:
>> On Sun Aug  4 14:47 -0500 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
>>> The next question is should we have new option as
>>> 
>>> --include-html
>>> 
>>> or as
>>> 
>>> --include-html=(true|false)
>>> 
>>> or even
>>> 
>>> --body=(true|false|text-and-html)
>>> 
>>> See --exclude option in http://notmuchmail.org/manpages/notmuch-search-1/
>>> and --body option in http://notmuchmail.org/manpages/notmuch-show-1/
>>> for comparison...
>>> 
>>
>> I have no preference here, although I guess I would vote for
>> --include-html=(true|false) since adding it to --body makes the --body
>> options confusing: to make sense the body options should be
>> --body=(text|text-and-html|none) but of course you can't change that and
>> break the command line API.  Well, maybe you could add all three
>> --body=(text|text-and-html|none)  and still accept true/false for
>> compatibility.
>
> Hi John & Tomi -
>
> We could trivially amend the argument parser to |= the keyword values
> (instead of =) to allow specifying keywords arguments multiple times on
> the command line. With the keyword values specified as bit flags, we
> could then have 'notmuch show --body=text --body=html ...' return both
> text and html, and allow trivial future extension too.
>
> --body=true and --body=false could be handled specially for backwards
> compatibility, for example by forcing text only or no parts,
> respectively. Since the default is currently text, --body=none might be
> a suitable synonym for --body=false, while the boolean alternatives
> could be deprecated.
>
> A little less trivially it's also possible to support e.g. comma
> separated keyword values, such as --body=text,html but I do prefer the
> (implementation) simplicity of the above.

I've also thought along these lines (except for the possibility of giving
the argument multiple times)...

But when I wrote my first reply I did not realise that the default
behaviour is to include all text/* parts *except* text/html, i.e.
text/html is excluded as a special case (and the non-excluded parts
include text/plain, text/calendar, text/whatnot etc...).

Doing a clean interface/implementation using --body is not trivial
(if possible at all). I played with options like
false/none -- true/text/plain -- all/textall/* and came to the conclusion
that maybe --include-html is the best option after all.

Now, if we have --include-html, should it be just that or
--include-html=(true|false)? Currently we have both cases; --verify,
--decrypt, --create-folder, --batch and --no-hooks all belong to the
set... I cannot form a clear opinion (without wast^H^H^H^H spending an
excessive amount of time figuring these out) on how this should be,
therefore I'm inclined to think that

the current patch from John with a simple --include-html could be applied,
and in the future (if anyone is interested) we update the parser so that
a boolean --arg is equivalent to --arg=true. Then it is just a matter of
how we decide to document these...
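
A sketch of what accepting both spellings might look like. The function
parse_boolean_option is hypothetical; it is not the existing notmuch
argument parser and only illustrates treating --arg as --arg=true.

```c
#include <stdbool.h>
#include <string.h>

/* Match arg against "--<name>", "--<name>=true" or "--<name>=false".
 * On a match, store the resulting value in *out and return true.
 * Anything else (including "--<name>=junk") returns false. */
static bool
parse_boolean_option (const char *arg, const char *name, bool *out)
{
    size_t len = strlen (name);

    if (strncmp (arg, "--", 2) != 0 || strncmp (arg + 2, name, len) != 0)
	return false;

    const char *rest = arg + 2 + len;

    if (*rest == '\0' || strcmp (rest, "=true") == 0) {
	*out = true;	/* a bare --name means the same as --name=true */
	return true;
    }
    if (strcmp (rest, "=false") == 0) {
	*out = false;
	return true;
    }
    return false;
}
```

With this, "--include-html" and "--include-html=true" both set the value
to true, while "--include-html=false" sets it to false.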

Tomi
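
For reference, the default rule described above (all text/* parts are
included except text/html, and --include-html lifts that exception) might
be sketched roughly as follows. This is only an illustration:
part_content_shown is a hypothetical name, not a function in notmuch, and
treating every non-text part as "not shown inline" is an assumption.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not a notmuch function: decide whether the
 * content of a MIME part is emitted inline in structured output. */
static bool
part_content_shown (const char *content_type, bool include_html)
{
    /* Assumption: only text/... parts are candidates for inline content. */
    if (strncasecmp (content_type, "text/", 5) != 0)
        return false;

    /* text/html is the special case: shown only when requested. */
    if (strcasecmp (content_type, "text/html") == 0)
        return include_html;

    /* text/plain, text/calendar, text/whatnot, ... */
    return true;
}
```

With this shape the new option only changes the answer for text/html,
which is why a plain boolean seems sufficient here.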

>
> Untested patch to the argument parser below.
>
> Cheers,
> Jani.
>
>
> diff --git a/command-line-arguments.c b/command-line-arguments.c
> index bf9aeca..c426054 100644
> --- a/command-line-arguments.c
> +++ b/command-line-arguments.c
> @@ -23,7 +23,10 @@ _process_keyword_arg (const notmuch_opt_desc_t *arg_desc, char next, const char
>      while (keywords->name) {
>  	if (strcmp (arg_str, keywords->name) == 0) {
>  	    if (arg_desc->output_var) {
> -		*((int *)arg_desc->output_var) = keywords->value;
> +		if (arg_desc->opt_type == NOTMUCH_OPT_KEYWORD_FLAGS)
> +		    *((int *)arg_desc->output_var) |= keywords->value;
> +		else
> +		    *((int *)arg_desc->output_var) = keywords->value;
>  	    }
>  	    return TRUE;
>  	}
> @@ -146,6 +149,7 @@ parse_option (const char *arg,
>  
>  	    switch (try->opt_type) {
>  	    case NOTMUCH_OPT_KEYWORD:
> +	    case NOTMUCH_OPT_KEYWORD_FLAGS:
>  		return _process_keyword_arg (try, next, value);
>  		break;
>  	    case NOTMUCH_OPT_BOOLEAN:
> diff --git a/command-line-arguments.h b/command-line-arguments.h
> index de1734a..085a492 100644
> --- a/command-line-arguments.h
> +++ b/command-line-arguments.h
> @@ -8,6 +8,7 @@ enum notmuch_opt_type {
>      NOTMUCH_OPT_BOOLEAN,	/* --verbose              */
>      NOTMUCH_OPT_INT,		/* --frob=8               */
>      NOTMUCH_OPT_KEYWORD,	/* --format=raw|json|text */
> +    NOTMUCH_OPT_KEYWORD_FLAGS,  /* the above with values OR'd together */
>      NOTMUCH_OPT_STRING,		/* --file=/tmp/gnarf.txt  */
>      NOTMUCH_OPT_POSITION	/* notmuch dump pos_arg   */
>  };
>
>
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
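
The keyword-flags idea in the quoted patch can be illustrated outside the
notmuch parser with a small standalone sketch. The names below (BODY_TEXT,
BODY_HTML, keyword_t, add_keyword_flag) are hypothetical and exist only
for this example; the one point taken from the patch is that the matched
keyword value is OR'd into the output variable instead of assigned.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values and names, for illustration only. */
enum { BODY_TEXT = 1 << 0, BODY_HTML = 1 << 1 };

typedef struct {
    const char *name;
    int value;
} keyword_t;

static const keyword_t body_keywords[] = {
    { "text", BODY_TEXT },
    { "html", BODY_HTML },
    { NULL, 0 }
};

/* OR the value of the matching keyword into *flags.
 * Returns 1 on a match, 0 if the keyword is unknown. */
static int
add_keyword_flag (const keyword_t *keywords, const char *arg, int *flags)
{
    for (; keywords->name; keywords++) {
        if (strcmp (arg, keywords->name) == 0) {
            *flags |= keywords->value;
            return 1;
        }
    }
    return 0;
}
```

Calling add_keyword_flag once for "text" and once for "html" on the same
variable leaves both bits set, which is what repeating --body=... on the
command line would rely on.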

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: cli: add --include-html option to notmuch show
  2013-08-18 18:30           ` Tomi Ollila
@ 2013-08-24  8:11             ` Mark Walters
  2013-08-24 10:59             ` Jani Nikula
  1 sibling, 0 replies; 14+ messages in thread
From: Mark Walters @ 2013-08-24  8:11 UTC (permalink / raw)
  To: Tomi Ollila, Jani Nikula, John Lenz, notmuch


Hi

Overall I like this patch.

On Sun, 18 Aug 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
> On Sun, Aug 18 2013, Jani Nikula <jani@nikula.org> wrote:
>
>> On Sat, 17 Aug 2013, John Lenz <lenz@math.uic.edu> wrote:
>>> On Sun Aug  4 14:47 -0500 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
>>>> The next question is should we have new option as
>>>> 
>>>> --include-html
>>>> 
>>>> or as
>>>> 
>>>> --include-html=(true|false)
>>>> 
>>>> or even
>>>> 
>>>> --body=(true|false|text-and-html)
>>>> 
>>>> See --exclude option in http://notmuchmail.org/manpages/notmuch-search-1/
>>>> and --body option in http://notmuchmail.org/manpages/notmuch-show-1/
>>>> for comparison...
>>>> 
>>>
>>> I have no preference here, although I guess I would vote for
>>> --include-html=(true|false) since adding it to --body makes the --body
>>> options confusing: to make sense the body options should be
>>> --body=(text|text-and-html|none) but of course you can't change that and
>>> break the command line API.  Well, maybe you could add all three
>>> --body=(text|text-and-html|none)  and still accept true/false for
>>> compatibility.
>>
>> Hi John & Tomi -
>>
>> We could trivially amend the argument parser to |= the keyword values
>> (instead of =) to allow specifying keyword arguments multiple times on
>> the command line. With the keyword values specified as bit flags, we
>> could then have 'notmuch show --body=text --body=html ...' return both
>> text and html, and allow trivial future extension too.
>>
>> --body=true and --body=false could be handled specially for backwards
>> compatibility, for example by forcing text only or no parts,
>> respectively. Since the default is currently text, --body=none might be
>> a suitable synonym for --body=false, while the boolean alternatives
>> could be deprecated.
>>
>> A little less trivially it's also possible to support e.g. comma
>> separated keyword values, such as --body=text,html but I do prefer the
>> (implementation) simplicity of the above.
>
> I've also thought along these lines (except for the possibility of giving
> an argument multiple times)...
>
> But when I wrote my first reply I did not realise that the default
> behaviour is to include all text/* parts *except* text/html, i.e.
> text/html is excluded as a special case (and the non-excluded parts
> include text/plain, text/calendar, text/whatnot etc...).
>
> Designing a clean interface/implementation using --body is not trivial
> (if it is possible at all). I played with options like
> false/none -- true/text/plain -- all/textall/* and came to the conclusion
> that maybe --include-html is the best option after all.
>
> Now, if we have --include-html, should it be like that or
> --include-html=(true|false)? Currently we have both cases, adding
> --verify, --decrypt, --create-folder, --batch, --no-hooks to the
> set... I cannot form a clear opinion (without wast^H^H^H^H spending
> an excessive amount of time figuring these out) on how this should be;
> therefore I'm inclined to the opinion that
>
> the current patch from John with a simple --include-html could be applied,
> and in the future (if anyone is interested) we update the parser
> to allow a boolean --arg to equal --arg=true. Then it is just a question
> of how we decide to document these...

I agree with Tomi on all of these points.

I think that with several patches floating around in this thread (the
original, some tests, Tomi's modified tests) it would be good to have a
new candidate series submitted. I think it would get my +1.

Best wishes

Mark



>
> Tomi
>
>>
>> Untested patch to the argument parser below.
>>
>> Cheers,
>> Jani.
>>
>>
>> diff --git a/command-line-arguments.c b/command-line-arguments.c
>> index bf9aeca..c426054 100644
>> --- a/command-line-arguments.c
>> +++ b/command-line-arguments.c
>> @@ -23,7 +23,10 @@ _process_keyword_arg (const notmuch_opt_desc_t *arg_desc, char next, const char
>>      while (keywords->name) {
>>  	if (strcmp (arg_str, keywords->name) == 0) {
>>  	    if (arg_desc->output_var) {
>> -		*((int *)arg_desc->output_var) = keywords->value;
>> +		if (arg_desc->opt_type == NOTMUCH_OPT_KEYWORD_FLAGS)
>> +		    *((int *)arg_desc->output_var) |= keywords->value;
>> +		else
>> +		    *((int *)arg_desc->output_var) = keywords->value;
>>  	    }
>>  	    return TRUE;
>>  	}
>> @@ -146,6 +149,7 @@ parse_option (const char *arg,
>>  
>>  	    switch (try->opt_type) {
>>  	    case NOTMUCH_OPT_KEYWORD:
>> +	    case NOTMUCH_OPT_KEYWORD_FLAGS:
>>  		return _process_keyword_arg (try, next, value);
>>  		break;
>>  	    case NOTMUCH_OPT_BOOLEAN:
>> diff --git a/command-line-arguments.h b/command-line-arguments.h
>> index de1734a..085a492 100644
>> --- a/command-line-arguments.h
>> +++ b/command-line-arguments.h
>> @@ -8,6 +8,7 @@ enum notmuch_opt_type {
>>      NOTMUCH_OPT_BOOLEAN,	/* --verbose              */
>>      NOTMUCH_OPT_INT,		/* --frob=8               */
>>      NOTMUCH_OPT_KEYWORD,	/* --format=raw|json|text */
>> +    NOTMUCH_OPT_KEYWORD_FLAGS,  /* the above with values OR'd together */
>>      NOTMUCH_OPT_STRING,		/* --file=/tmp/gnarf.txt  */
>>      NOTMUCH_OPT_POSITION	/* notmuch dump pos_arg   */
>>  };
>>
>>
>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: cli: add --include-html option to notmuch show
  2013-08-18 18:30           ` Tomi Ollila
  2013-08-24  8:11             ` Mark Walters
@ 2013-08-24 10:59             ` Jani Nikula
  1 sibling, 0 replies; 14+ messages in thread
From: Jani Nikula @ 2013-08-24 10:59 UTC (permalink / raw)
  To: Tomi Ollila, John Lenz, notmuch

On Sun, 18 Aug 2013, Tomi Ollila <tomi.ollila@iki.fi> wrote:
> Now, if we have --include-html, should it be like that or
> --include-html=(true|false)? Currently we have both cases, adding
> --verify, --decrypt, --create-folder, --batch, --no-hooks to the
> set... I cannot form a clear opinion (without wast^H^H^H^H spending
> an excessive amount of time figuring these out) on how this should be;
> therefore I'm inclined to the opinion that
>
> the current patch from John with a simple --include-html could be applied,
> and in the future (if anyone is interested) we update the parser
> to allow a boolean --arg to equal --arg=true. Then it is just a question
> of how we decide to document these...

The argument parser we have allows NOTMUCH_OPT_BOOLEAN options to be
specified as --foo=(true|false) or simply --foo (for true). It is
already now just a matter of documentation, and I'm sure we're not
consistent.

We also have things like --no-hooks in notmuch new. I added that, and in
retrospect, that should be just --hooks=(true|false). I guess we left it
like that because the default is true. Perhaps we should amend the
argument parser to look for boolean option --foo if it encounters a
--no-foo option.

Why notmuch show has --exclude=(true|false) as a keyword option I don't
quite grasp; the --entire-thread option at least has a special
default. It might help some options if the parser had a way to tell if
it's seen some option or not.

Of course, none of this is really relevant to the patch in question!

BR,
Jani.
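
To make the boolean spellings above concrete, here is a minimal
standalone sketch. It is not the notmuch argument parser:
parse_bool_option is a hypothetical helper, and its --no-foo branch
implements the suggested amendment, not existing behaviour.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not notmuch's parser. Parse one boolean option
 * called `name` from `arg`. Accepted spellings: --name, --name=true,
 * --name=false and (the suggested extension) --no-name.
 * Returns true and stores the result in *value when arg matches. */
static bool
parse_bool_option (const char *arg, const char *name, bool *value)
{
    size_t len = strlen (name);

    if (strncmp (arg, "--", 2) != 0)
        return false;
    arg += 2;

    /* --no-name is shorthand for --name=false. */
    if (strncmp (arg, "no-", 3) == 0 && strcmp (arg + 3, name) == 0) {
        *value = false;
        return true;
    }

    if (strncmp (arg, name, len) != 0)
        return false;
    arg += len;

    /* A bare --name means true. */
    if (*arg == '\0' || strcmp (arg, "=true") == 0) {
        *value = true;
        return true;
    }
    if (strcmp (arg, "=false") == 0) {
        *value = false;
        return true;
    }
    return false;
}
```

With something along these lines --no-hooks and --hooks=false would be
interchangeable, and which spelling gets documented becomes a separate
question.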

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/1] test: test notmuch show --include-html option
  2013-07-02  0:19 cli: add --include-html option to notmuch show John Lenz
  2013-07-21 20:23 ` Tomi Ollila
@ 2013-08-24 15:29 ` Tomi Ollila
  2013-08-24 21:36   ` Mark Walters
  2013-08-27 11:04   ` David Bremner
  2013-08-27 11:03 ` cli: add --include-html option to notmuch show David Bremner
  2 siblings, 2 replies; 14+ messages in thread
From: Tomi Ollila @ 2013-08-24 15:29 UTC (permalink / raw)
  To: notmuch; +Cc: tomi.ollila

Test new --include-html option added to notmuch show command with
json output message parts containing text in latin1 and utf8 format.
---

This is a test for id:notmuch-web-1372724382.450184839@www.wuzzeb.org, as
requested by Mark in id:87txifzexo.fsf@qmul.ac.uk

 test/multipart | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 81 insertions(+), 1 deletion(-)

diff --git a/test/multipart b/test/multipart
index 2033023..b40fa2c 100755
--- a/test/multipart
+++ b/test/multipart
@@ -647,4 +647,84 @@ notmuch show --format=raw --part=3 id:base64-part-with-crlf > crlf.out
 echo -n -e "\xEF\x0D\x0A" > crlf.expected
 test_expect_equal_file crlf.out crlf.expected
 
-test_done
\ No newline at end of file
+
+# The ISO-8859-1 encoding of U+00BD is a single byte: octal 275
+# (Portability note: Dollar-Single ($'...', ANSI C-style escape sequences)
+# quoting works on bash, ksh, zsh, *BSD sh but not on dash, ash nor busybox sh)
+readonly u_00bd_latin1=$'\275'
+
+# The Unicode fraction symbol 1/2 is U+00BD and is encoded
+# in UTF-8 as two bytes: octal 302 275
+readonly u_00bd_utf8=$'\302\275'
+
+cat <<EOF > ${MAIL_DIR}/include-html
+From: A <a@example.com>
+To: B <b@example.com>
+Subject: html message
+Date: Sat, 01 January 2000 00:00:00 +0000
+Message-ID: <htmlmessage>
+MIME-Version: 1.0
+Content-Type: multipart/alternative; boundary="==-=="
+
+--==-==
+Content-Type: text/html; charset=UTF-8
+
+<p>0.5 equals ${u_00bd_utf8}</p>
+
+--==-==
+Content-Type: text/html; charset=ISO-8859-1
+
+<p>0.5 equals ${u_00bd_latin1}</p>
+
+--==-==
+Content-Type: text/plain; charset=UTF-8
+
+0.5 equals ${u_00bd_utf8}
+
+--==-==--
+EOF
+
+notmuch new > /dev/null
+
+cat_expected_head ()
+{
+        cat <<EOF
+[[[{"id": "htmlmessage", "match":true, "excluded": false, "date_relative":"2000-01-01",
+   "timestamp": 946684800,
+   "filename": "${MAIL_DIR}/include-html",
+   "tags": ["inbox", "unread"],
+   "headers": { "Date": "Sat, 01 Jan 2000 00:00:00 +0000", "From": "A <a@example.com>",
+                "Subject": "html message", "To": "B <b@example.com>"},
+   "body": [{
+     "content-type": "multipart/alternative", "id": 1,
+EOF
+}
+
+cat_expected_head > EXPECTED.nohtml
+cat <<EOF >> EXPECTED.nohtml
+"content": [
+  { "id": 2, "content-charset": "UTF-8", "content-length": 21, "content-type": "text/html"},
+  { "id": 3, "content-charset": "ISO-8859-1", "content-length": 20, "content-type": "text/html"},
+  { "id": 4, "content-type": "text/plain", "content": "0.5 equals \\u00bd\\n"}
+]}]},[]]]]
+EOF
+
+# Both the UTF-8 and ISO-8859-1 part should have U+00BD
+cat_expected_head > EXPECTED.withhtml
+cat <<EOF >> EXPECTED.withhtml
+"content": [
+  { "id": 2, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
+  { "id": 3, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
+  { "id": 4, "content-type": "text/plain", "content": "0.5 equals \\u00bd\\n"}
+]}]},[]]]]
+EOF
+
+test_begin_subtest "html parts excluded by default"
+notmuch show --format=json id:htmlmessage > OUTPUT
+test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.nohtml)"
+
+test_begin_subtest "html parts included"
+notmuch show --format=json --include-html id:htmlmessage > OUTPUT
+test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.withhtml)"
+
+test_done
-- 
1.8.0
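
As a cross-check of the byte counts in the expected output above (21 for
the UTF-8 html part, 20 for the ISO-8859-1 one), the Latin-1 to UTF-8
mapping can be sketched as below. latin1_to_utf8 is a hypothetical helper
written for this note; it is not how notmuch converts charsets.

```c
#include <stddef.h>
#include <string.h>

/* Convert ISO-8859-1 bytes to UTF-8. Every Latin-1 byte maps to the
 * code point of the same value, so bytes >= 0x80 become two bytes.
 * `out` must have room for 2 * len + 1 bytes. Returns the UTF-8 length. */
static size_t
latin1_to_utf8 (const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            out[n++] = in[i];
        } else {
            /* Two-byte form: 110xxxxx 10xxxxxx */
            out[n++] = 0xC0 | (in[i] >> 6);
            out[n++] = 0x80 | (in[i] & 0x3F);
        }
    }
    out[n] = '\0';
    return n;
}
```

For the single byte octal 275 this yields octal 302 275, so the 20-byte
Latin-1 body grows to 21 bytes in UTF-8 and both html parts end up as
the same string once decoded.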

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] test: test notmuch show --include-html option
  2013-08-24 15:29 ` [PATCH 1/1] test: test notmuch show --include-html option Tomi Ollila
@ 2013-08-24 21:36   ` Mark Walters
  2013-08-27 11:04   ` David Bremner
  1 sibling, 0 replies; 14+ messages in thread
From: Mark Walters @ 2013-08-24 21:36 UTC (permalink / raw)
  To: Tomi Ollila, notmuch; +Cc: tomi.ollila


Hi

Thanks for this. The pair of patches
id:notmuch-web-1372724382.450184839@www.wuzzeb.org and
id:1377358170-20561-1-git-send-email-tomi.ollila@iki.fi LGTM +1

Best wishes

Mark

Tomi Ollila <tomi.ollila@iki.fi> writes:

> Test new --include-html option added to notmuch show command with
> json output message parts containing text in latin1 and utf8 format.
> ---
>
> This is a test for id:notmuch-web-1372724382.450184839@www.wuzzeb.org, as
> requested by Mark in id:87txifzexo.fsf@qmul.ac.uk
>
>  test/multipart | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 81 insertions(+), 1 deletion(-)
>
> diff --git a/test/multipart b/test/multipart
> index 2033023..b40fa2c 100755
> --- a/test/multipart
> +++ b/test/multipart
> @@ -647,4 +647,84 @@ notmuch show --format=raw --part=3 id:base64-part-with-crlf > crlf.out
>  echo -n -e "\xEF\x0D\x0A" > crlf.expected
>  test_expect_equal_file crlf.out crlf.expected
>  
> -test_done
> \ No newline at end of file
> +
> +# The ISO-8859-1 encoding of U+00BD is a single byte: octal 275
> +# (Portability note: Dollar-Single ($'...', ANSI C-style escape sequences)
> +# quoting works on bash, ksh, zsh, *BSD sh but not on dash, ash nor busybox sh)
> +readonly u_00bd_latin1=$'\275'
> +
> +# The Unicode fraction symbol 1/2 is U+00BD and is encoded
> +# in UTF-8 as two bytes: octal 302 275
> +readonly u_00bd_utf8=$'\302\275'
> +
> +cat <<EOF > ${MAIL_DIR}/include-html
> +From: A <a@example.com>
> +To: B <b@example.com>
> +Subject: html message
> +Date: Sat, 01 January 2000 00:00:00 +0000
> +Message-ID: <htmlmessage>
> +MIME-Version: 1.0
> +Content-Type: multipart/alternative; boundary="==-=="
> +
> +--==-==
> +Content-Type: text/html; charset=UTF-8
> +
> +<p>0.5 equals ${u_00bd_utf8}</p>
> +
> +--==-==
> +Content-Type: text/html; charset=ISO-8859-1
> +
> +<p>0.5 equals ${u_00bd_latin1}</p>
> +
> +--==-==
> +Content-Type: text/plain; charset=UTF-8
> +
> +0.5 equals ${u_00bd_utf8}
> +
> +--==-==--
> +EOF
> +
> +notmuch new > /dev/null
> +
> +cat_expected_head ()
> +{
> +        cat <<EOF
> +[[[{"id": "htmlmessage", "match":true, "excluded": false, "date_relative":"2000-01-01",
> +   "timestamp": 946684800,
> +   "filename": "${MAIL_DIR}/include-html",
> +   "tags": ["inbox", "unread"],
> +   "headers": { "Date": "Sat, 01 Jan 2000 00:00:00 +0000", "From": "A <a@example.com>",
> +                "Subject": "html message", "To": "B <b@example.com>"},
> +   "body": [{
> +     "content-type": "multipart/alternative", "id": 1,
> +EOF
> +}
> +
> +cat_expected_head > EXPECTED.nohtml
> +cat <<EOF >> EXPECTED.nohtml
> +"content": [
> +  { "id": 2, "content-charset": "UTF-8", "content-length": 21, "content-type": "text/html"},
> +  { "id": 3, "content-charset": "ISO-8859-1", "content-length": 20, "content-type": "text/html"},
> +  { "id": 4, "content-type": "text/plain", "content": "0.5 equals \\u00bd\\n"}
> +]}]},[]]]]
> +EOF
> +
> +# Both the UTF-8 and ISO-8859-1 part should have U+00BD
> +cat_expected_head > EXPECTED.withhtml
> +cat <<EOF >> EXPECTED.withhtml
> +"content": [
> +  { "id": 2, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
> +  { "id": 3, "content-type": "text/html", "content": "<p>0.5 equals \\u00bd</p>\\n"},
> +  { "id": 4, "content-type": "text/plain", "content": "0.5 equals \\u00bd\\n"}
> +]}]},[]]]]
> +EOF
> +
> +test_begin_subtest "html parts excluded by default"
> +notmuch show --format=json id:htmlmessage > OUTPUT
> +test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.nohtml)"
> +
> +test_begin_subtest "html parts included"
> +notmuch show --format=json --include-html id:htmlmessage > OUTPUT
> +test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.withhtml)"
> +
> +test_done
> -- 
> 1.8.0
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: cli: add --include-html option to notmuch show
  2013-07-02  0:19 cli: add --include-html option to notmuch show John Lenz
  2013-07-21 20:23 ` Tomi Ollila
  2013-08-24 15:29 ` [PATCH 1/1] test: test notmuch show --include-html option Tomi Ollila
@ 2013-08-27 11:03 ` David Bremner
  2 siblings, 0 replies; 14+ messages in thread
From: David Bremner @ 2013-08-27 11:03 UTC (permalink / raw)
  To: John Lenz, notmuch

John Lenz <lenz@math.uic.edu> writes:

> Therefore, this patch adds an --include-html option that causes the
> text/html parts to be included as part of the output of show.
>

pushed,

d

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/1] test: test notmuch show --include-html option
  2013-08-24 15:29 ` [PATCH 1/1] test: test notmuch show --include-html option Tomi Ollila
  2013-08-24 21:36   ` Mark Walters
@ 2013-08-27 11:04   ` David Bremner
  1 sibling, 0 replies; 14+ messages in thread
From: David Bremner @ 2013-08-27 11:04 UTC (permalink / raw)
  To: Tomi Ollila, notmuch; +Cc: tomi.ollila

Tomi Ollila <tomi.ollila@iki.fi> writes:

> Test new --include-html option added to notmuch show command with
> json output message parts containing text in latin1 and utf8 format.

pushed,

d

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-08-27 11:04 UTC | newest]

Thread overview: 14+ messages
2013-07-02  0:19 cli: add --include-html option to notmuch show John Lenz
2013-07-21 20:23 ` Tomi Ollila
2013-07-22 16:49   ` John Lenz
2013-07-25  2:36   ` John Lenz
2013-08-04 19:47     ` Tomi Ollila
2013-08-17 15:52       ` John Lenz
2013-08-18 11:25         ` Jani Nikula
2013-08-18 18:30           ` Tomi Ollila
2013-08-24  8:11             ` Mark Walters
2013-08-24 10:59             ` Jani Nikula
2013-08-24 15:29 ` [PATCH 1/1] test: test notmuch show --include-html option Tomi Ollila
2013-08-24 21:36   ` Mark Walters
2013-08-27 11:04   ` David Bremner
2013-08-27 11:03 ` cli: add --include-html option to notmuch show David Bremner

Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/
