Searching through different charsets

unofficial mirror of notmuch@notmuchmail.org

* Searching through different charsets
@ 2012-02-22 17:10 Serge Z
  2012-02-24  0:31 ` Michal Sojka
  0 siblings, 1 reply; 15+ messages in thread
From: Serge Z @ 2012-02-22 17:10 UTC (permalink / raw)
  To: notmuch

Hello!

I've got the following problem: fetched emails can be in different encodings,
and searching for a term typed in one encoding (the system default) does not
match the same term in another encoding.

The solution, as I see it, could be to preprocess each incoming email to
"normalize" it and its encoding, so that the indexer handles emails in the
system encoding only. Could you please suggest something?

Another issue (less important, but still wanted) is searching through html
messages without matching html tags.

This problem looks solvable with a properly configured run-mailcap. Is there
such a solution anywhere?

Thanks.


* Re: Searching through different charsets
  2012-02-22 17:10 Searching through different charsets Serge Z
@ 2012-02-24  0:31 ` Michal Sojka
  2012-02-24  0:33   ` [PATCH] test: Add test for searching of uncommonly encoded messages Michal Sojka
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Sojka @ 2012-02-24  0:31 UTC (permalink / raw)
  To: Serge Z, notmuch

On Wed, 22 Feb 2012, Serge Z wrote:
> 
> Hello!
> 
> I've got the following problem: fetched emails can be in different encodings,
> and searching for a term typed in one encoding (the system default) does not
> match the same term in another encoding.
> 
> The solution, as I see it, could be to preprocess each incoming email to
> "normalize" it and its encoding, so that the indexer handles emails in the
> system encoding only. Could you please suggest something?

I can confirm this issue and am sending a patch with a test case (marked as
broken) for it. I expect the fix to be quite simple because all the
encoding/decoding stuff is already implemented in gmime, which notmuch
uses when indexing.
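
To make the mismatch concrete, here is a minimal sketch in Python (an illustration only, not notmuch code; the helper name to_utf8 is made up for this example). It uses the Czech word from the test case in this thread and shows that its ISO-8859-2 bytes differ from its UTF-8 bytes, and that decoding with the declared charset makes the two comparable again:

```python
def to_utf8(raw: bytes, charset: str) -> bytes:
    """Decode raw bytes using the declared charset and re-encode as UTF-8."""
    return raw.decode(charset).encode("utf-8")


# Search term as typed in a UTF-8 terminal.
query = "tučňáččí".encode("utf-8")

# The same word as it appears in an ISO-8859-2 message body (8bit).
body_word = b"tu\xe8\xf2\xe1\xe8\xe8\xed"

# The raw bytes differ, so a byte-wise match fails.
print(query == body_word)                          # False

# After conversion with the declared charset, the term matches.
print(query == to_utf8(body_word, "iso-8859-2"))   # True
```

This is the kind of conversion a charset filter would have to perform before the text reaches the indexer.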

> 
> Another issue (less important, but still wanted) is searching through html
> messages without matching html tags.

I don't know whether anybody is working on this or not.

> This problem looks solvable with a properly configured run-mailcap. Is there
> such a solution anywhere?

I don't think that run-mailcap has anything to do with notmuch.

-Michal


* [PATCH] test: Add test for searching of uncommonly encoded messages
  2012-02-24  0:31 ` Michal Sojka
@ 2012-02-24  0:33   ` Michal Sojka
  2012-02-24  4:29     ` Serge Z
                       ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Michal Sojka @ 2012-02-24  0:33 UTC (permalink / raw)
  To: notmuch

Emails that are encoded differently than as ASCII or UTF-8 are not
indexed properly by notmuch. It is not possible to search for non-ASCII
words within those messages.
---
 test/encoding    |    9 +++++++++
 test/test-lib.sh |    5 +++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/test/encoding b/test/encoding
index 33259c1..3992b5c 100755
--- a/test/encoding
+++ b/test/encoding
@@ -21,4 +21,13 @@ irrelevant
 \fbody}
 \fmessage}"
 
+test_begin_subtest "Search for ISO-8859-2 encoded message"
+test_subtest_known_broken
+add_message '[content-type]="text/plain; charset=iso-8859-2"' \
+            '[content-transfer-encoding]=8bit' \
+            '[subject]="ISO-8859-2 encoded message"' \
+            "[body]=$'Czech word tu\350\362\341\350\350\355 means pinguin\'s.'" # ISO-8859-2 characters are generated by shell's escape sequences
+output=$(notmuch search tučňáččí 2>&1 | notmuch_show_sanitize)
+test_expect_equal "$output" "thread:0000000000000002   2001-01-05 [1/1] Notmuch Test Suite; ISO-8859-2 encoded message (inbox unread)"
+
 test_done
diff --git a/test/test-lib.sh b/test/test-lib.sh
index 063a2b2..2781506 100644
--- a/test/test-lib.sh
+++ b/test/test-lib.sh
@@ -356,6 +356,11 @@ ${additional_headers}"
 ${additional_headers}"
     fi
 
+    if [ ! -z "${template[content-transfer-encoding]}" ]; then
+	additional_headers="Content-Transfer-Encoding: ${template[content-transfer-encoding]}
+${additional_headers}"
+    fi
+
     # Note that in the way we're setting it above and using it below,
     # `additional_headers' will also serve as the header / body separator
     # (empty line in between).
-- 
1.7.9.1


* Re: [PATCH] test: Add test for searching of uncommonly encoded messages
  2012-02-24  0:33   ` [PATCH] test: Add test for searching of uncommonly encoded messages Michal Sojka
@ 2012-02-24  4:29     ` Serge Z
  2012-02-24  7:00       ` Michal Sojka
  2012-02-24  7:36     ` [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them Michal Sojka
  2012-02-29 11:55     ` [PATCH] test: Add test for searching of uncommonly encoded messages David Bremner
  2 siblings, 1 reply; 15+ messages in thread
From: Serge Z @ 2012-02-24  4:29 UTC (permalink / raw)
  To: notmuch

Quoting Michal Sojka (2012-02-24 04:33:15)
>Emails that are encoded differently than as ASCII or UTF-8 are not
>indexed properly by notmuch. It is not possible to search for non-ASCII
>words within those messages.

OK. But we can preprocess each incoming message right after 'getmail' to
convert it from html to text and to utf8 encoding. One solution is to create a
separate script for this and make getmail pipe all messages to this script, and
then to notmuch. But it would be better if the maildir contained original messages
only, so the question is: can we make the notmuch indexing engine index a
preprocessed message while the maildir keeps the original message as it was
obtained?


* Re: [PATCH] test: Add test for searching of uncommonly encoded messages
  2012-02-24  4:29     ` Serge Z
@ 2012-02-24  7:00       ` Michal Sojka
  2012-02-24  7:57         ` Serge Z
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Sojka @ 2012-02-24  7:00 UTC (permalink / raw)
  To: Serge Z, notmuch

On Fri, 24 Feb 2012, Serge Z wrote:
> 
> Quoting Michal Sojka (2012-02-24 04:33:15)
> >Emails that are encoded differently than as ASCII or UTF-8 are not
> >indexed properly by notmuch. It is not possible to search for non-ASCII
> >words within those messages.
> 
> OK. But we can preprocess each incoming message right after 'getmail' to
> convert it from html to text and to utf8 encoding. One solution is to create a
> separate script for this and make getmail pipe all messages to this script, and
> then to notmuch. But it would be better if the maildir contained original messages
> only, so the question is: can we make the notmuch indexing engine index a
> preprocessed message while the maildir keeps the original message as it was
> obtained?

Hi,

I'm not a big fan of adding a "preprocessor". First, I think that both
issues you mention are actually bugs, and it would be better to fix them
for everybody than to require each user to configure some preprocessor.
Second, depending on what your preprocessor does and how, the
initial mail indexing could be way slower, which is also not something
people want.

Do you have any other use case for the preprocessor besides utf8 and
html->text conversions?

Cheers,
-Michal


* [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them
  2012-02-24  0:33   ` [PATCH] test: Add test for searching of uncommonly encoded messages Michal Sojka
  2012-02-24  4:29     ` Serge Z
@ 2012-02-24  7:36     ` Michal Sojka
  2012-02-24  7:36       ` [PATCH 2/2] test: Remove 'broken' flag from encoding test Michal Sojka
                         ` (2 more replies)
  2012-02-29 11:55     ` [PATCH] test: Add test for searching of uncommonly encoded messages David Bremner
  2 siblings, 3 replies; 15+ messages in thread
From: Michal Sojka @ 2012-02-24  7:36 UTC (permalink / raw)
  To: notmuch

This fixes a bug that prevented searching for non-ASCII words in such
parts. The code here was copied from show_text_part_content(), because
the show command already does the needed conversion when showing the
message.
---
 lib/index.cc |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/lib/index.cc b/lib/index.cc
index d8f8b2b..e377732 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -315,6 +315,7 @@ _index_mime_part (notmuch_message_t *message,
     GByteArray *byte_array;
     GMimeContentDisposition *disposition;
     char *body;
+    const char *charset;
 
     if (! part) {
 	fprintf (stderr, "Warning: Not indexing empty mime part.\n");
@@ -390,6 +391,20 @@ _index_mime_part (notmuch_message_t *message,
     g_mime_stream_filter_add (GMIME_STREAM_FILTER (filter),
 			      discard_uuencode_filter);
 
+    charset = g_mime_object_get_content_type_parameter (part, "charset");
+    if (charset) {
+	GMimeFilter *charset_filter;
+	charset_filter = g_mime_filter_charset_new (charset, "UTF-8");
+	/* This result can be NULL for things like "unknown-8bit".
+	 * Don't set a NULL filter as that makes GMime print
+	 * annoying assertion-failure messages on stderr. */
+	if (charset_filter) {
+	    g_mime_stream_filter_add (GMIME_STREAM_FILTER (filter),
+				      charset_filter);
+	    g_object_unref (charset_filter);
+	}
+    }
+
     wrapper = g_mime_part_get_content_object (GMIME_PART (part));
     if (wrapper)
 	g_mime_data_wrapper_write_to_stream (wrapper, filter);
-- 
1.7.9.1


* [PATCH 2/2] test: Remove 'broken' flag from encoding test
  2012-02-24  7:36     ` [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them Michal Sojka
@ 2012-02-24  7:36       ` Michal Sojka
  2012-02-25  4:33       ` [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them Austin Clements
  2012-02-29 11:55       ` David Bremner
  2 siblings, 0 replies; 15+ messages in thread
From: Michal Sojka @ 2012-02-24  7:36 UTC (permalink / raw)
  To: notmuch

---
 test/encoding |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/test/encoding b/test/encoding
index 3992b5c..f0d073c 100755
--- a/test/encoding
+++ b/test/encoding
@@ -22,7 +22,6 @@ irrelevant
 \fmessage}"
 
 test_begin_subtest "Search for ISO-8859-2 encoded message"
-test_subtest_known_broken
 add_message '[content-type]="text/plain; charset=iso-8859-2"' \
             '[content-transfer-encoding]=8bit' \
             '[subject]="ISO-8859-2 encoded message"' \
-- 
1.7.9.1


* Re: [PATCH] test: Add test for searching of uncommonly encoded messages
  2012-02-24  7:00       ` Michal Sojka
@ 2012-02-24  7:57         ` Serge Z
  2012-02-24  8:38           ` Michal Sojka
  0 siblings, 1 reply; 15+ messages in thread
From: Serge Z @ 2012-02-24  7:57 UTC (permalink / raw)
  To: notmuch


Quoting Michal Sojka (2012-02-24 11:00:02)
>On Fri, 24 Feb 2012, Serge Z wrote:
>> 
>> Quoting Michal Sojka (2012-02-24 04:33:15)
>> >Emails that are encoded differently than as ASCII or UTF-8 are not
>> >indexed properly by notmuch. It is not possible to search for non-ASCII
>> >words within those messages.
>> 
>> OK. But we can preprocess each incoming message right after 'getmail' to
>> convert it from html to text and to utf8 encoding. One solution is to create a
>> separate script for this and make getmail pipe all messages to this script, and
>> then to notmuch. But it would be better if the maildir contained original messages
>> only, so the question is: can we make the notmuch indexing engine index a
>> preprocessed message while the maildir keeps the original message as it was
>> obtained?
>
>Hi,
>
>I'm not a big fan of adding a "preprocessor". First, I think that both
>issues you mention are actually bugs, and it would be better to fix them
>for everybody than to require each user to configure some preprocessor.
>Second, depending on what your preprocessor does and how, the
>initial mail indexing could be way slower, which is also not something
>people want.
>
>Do you have any other use case for the preprocessor besides utf8 and
>html->text conversions?
>
>Cheers,
>-Michal

Well, I don't want to add an external preprocessor either.

This may be considered an architectural decision: the search engine should not
access messages directly, but through some preprocessing layer that would
handle different encodings in the body and headers, RFC 2047-encoded
headers (if these are not handled yet), etc.

Anyway, IMHO it would be nice for this solution to live in a separate
library, which would be useful for notmuch clients as well as for other mail
indexing engines. Or an existing library should be looked for.
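
As an illustration of the RFC 2047 case, here is a small sketch using Python's standard email.header module. It only shows what such a layer would have to do with an encoded header; it is not a claim about how notmuch or any particular library handles it, and decode_rfc2047 is a name invented for this example:

```python
from email.header import decode_header, make_header


def decode_rfc2047(value: str) -> str:
    """Return a header value with RFC 2047 encoded-words decoded to text."""
    return str(make_header(decode_header(value)))


# A subject whose single word is ISO-8859-2, Q-encoded for transport.
encoded_subject = "=?iso-8859-2?Q?tu=E8=F2=E1=E8=E8=ED?="
decoded_subject = decode_rfc2047(encoded_subject)

# Once decoded, the header is ordinary text and can be indexed as UTF-8.
print(decoded_subject.encode("utf-8"))
```

A plain ASCII header passes through such a function unchanged, so it could be applied to every header before indexing.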


* Re: [PATCH] test: Add test for searching of uncommonly encoded messages
  2012-02-24  7:57         ` Serge Z
@ 2012-02-24  8:38           ` Michal Sojka
  2012-02-25  8:36             ` Serge Z
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Sojka @ 2012-02-24  8:38 UTC (permalink / raw)
  To: Serge Z, notmuch

On Fri, 24 Feb 2012, Serge Z wrote:
> 
> Quoting Michal Sojka (2012-02-24 11:00:02)
> >I'm not a big fan of adding a "preprocessor". First, I think that both
> >issues you mention are actually bugs, and it would be better to fix them
> >for everybody than to require each user to configure some preprocessor.
> >Second, depending on what your preprocessor does and how, the
> >initial mail indexing could be way slower, which is also not something
> >people want.
> >
> >Do you have any other use case for the preprocessor besides utf8 and
> >html->text conversions?
> >
> >Cheers,
> >-Michal
> 
> Well, I don't want to add an external preprocessor either.
> 
> This may be considered an architectural decision: the search engine should not
> access messages directly, but through some preprocessing layer that would
> handle different encodings in the body and headers, RFC 2047-encoded
> headers (if these are not handled yet), etc.
> 
> Anyway, IMHO it would be nice for this solution to live in a separate
> library

Yes, this library is called gmime and notmuch already makes use of it.

-Michal


* Re: [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them
  2012-02-24  7:36     ` [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them Michal Sojka
  2012-02-24  7:36       ` [PATCH 2/2] test: Remove 'broken' flag from encoding test Michal Sojka
@ 2012-02-25  4:33       ` Austin Clements
  2012-02-29 11:55       ` David Bremner
  2 siblings, 0 replies; 15+ messages in thread
From: Austin Clements @ 2012-02-25  4:33 UTC (permalink / raw)
  To: Michal Sojka; +Cc: notmuch

LGTM.  I'm assuming this interacts with the uuencoding filter in the
right order (I don't see how any other order could be correct), but
don't actually know.

Quoth Michal Sojka on Feb 24 at  8:36 am:
> This fixes a bug that prevented searching for non-ASCII words in such
> parts. The code here was copied from show_text_part_content(), because
> the show command already does the needed conversion when showing the
> message.
> ---
>  lib/index.cc |   15 +++++++++++++++
>  1 files changed, 15 insertions(+), 0 deletions(-)
> 
> diff --git a/lib/index.cc b/lib/index.cc
> index d8f8b2b..e377732 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -315,6 +315,7 @@ _index_mime_part (notmuch_message_t *message,
>      GByteArray *byte_array;
>      GMimeContentDisposition *disposition;
>      char *body;
> +    const char *charset;
>  
>      if (! part) {
>  	fprintf (stderr, "Warning: Not indexing empty mime part.\n");
> @@ -390,6 +391,20 @@ _index_mime_part (notmuch_message_t *message,
>      g_mime_stream_filter_add (GMIME_STREAM_FILTER (filter),
>  			      discard_uuencode_filter);
>  
> +    charset = g_mime_object_get_content_type_parameter (part, "charset");
> +    if (charset) {
> +	GMimeFilter *charset_filter;
> +	charset_filter = g_mime_filter_charset_new (charset, "UTF-8");
> +	/* This result can be NULL for things like "unknown-8bit".
> +	 * Don't set a NULL filter as that makes GMime print
> +	 * annoying assertion-failure messages on stderr. */
> +	if (charset_filter) {
> +	    g_mime_stream_filter_add (GMIME_STREAM_FILTER (filter),
> +				      charset_filter);
> +	    g_object_unref (charset_filter);
> +	}
> +    }
> +
>      wrapper = g_mime_part_get_content_object (GMIME_PART (part));
>      if (wrapper)
>  	g_mime_data_wrapper_write_to_stream (wrapper, filter);


* Re: [PATCH] test: Add test for searching of uncommonly encoded messages
  2012-02-24  8:38           ` Michal Sojka
@ 2012-02-25  8:36             ` Serge Z
  2012-02-26  9:33               ` Double decoded text/html parts (was: [PATCH] test: Add test for searching of uncommonly encoded messages) Michal Sojka
  0 siblings, 1 reply; 15+ messages in thread
From: Serge Z @ 2012-02-25  8:36 UTC (permalink / raw)
  To: notmuch

Hi!
I've run into another problem:

I've got a text/html email with a body encoded in cp1251.
Its encoding is mentioned in both the Content-Type: email header and the html <meta>
tag. So when the client tries to display it with an external html2text converter,
the message is decoded twice: first by the client, then by html2text (I use w3m).

As I understand it, notmuch (while indexing this message) decodes it once and
indexes it correctly (though it includes html tags in the index). But what if
the message has no "charset" parameter in the Content-Type email header but
contains a <meta> content-type tag that specifies a charset? Should such a message be
considered wrongly composed, or should it be indexed by diving into
html details (content-type)?


* Re: Double decoded text/html parts (was: [PATCH] test: Add test for searching of uncommonly encoded messages)
  2012-02-25  8:36             ` Serge Z
@ 2012-02-26  9:33               ` Michal Sojka
  2012-02-26 10:20                 ` Serge Z
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Sojka @ 2012-02-26  9:33 UTC (permalink / raw)
  To: Serge Z, notmuch

On Sat, 25 Feb 2012, Serge Z wrote:
> 
> Hi!
> I've run into another problem:
> 
> I've got a text/html email with a body encoded in cp1251.
> Its encoding is mentioned in both the Content-Type: email header and the html <meta>
> tag. So when the client tries to display it with an external html2text converter,
> the message is decoded twice: first by the client, then by html2text (I
> use w3m).

Right. After my analysis of the problem (see below) it seems there is no
trivial solution for this.

> As I understand it, notmuch (while indexing this message) decodes it once and
> indexes it correctly (though it includes html tags in the index). But what if
> the message has no "charset" parameter in the Content-Type email header but
> contains a <meta> content-type tag that specifies a charset?

This should not happen. It violates RFC 2046, section 4.1.2.

> Should such a message be considered wrongly composed, or should it
> be indexed by diving into html details (content-type)?

I don't think it's wrongly composed, and it should even be indexed
correctly (with my patch). The problem arises when you view such a message
with an external HTML viewer.

In my mailbox I can find two different types of text/html parts. First,
there are parts that contain a complete HTML document including all headers and
especially <meta http-equiv="content-type" content="text/html; ...">.
Such parts could be passed to an external HTML viewer without any decoding
by notmuch.

The second type is a text/html part that does not contain any HTML
headers. Passing such a part to an external HTML viewer undecoded would
require the viewer to guess the correct charset from the content.

AFAIK Firefox users can set a fallback charset (used for HTML documents
with unknown charset) in the preferences, but I don't know what other
browsers would do. In particular, do you know how w3m behaves when the
charset is not specified?

In any case, if we want notmuch to do the right thing, we should analyze
the content of text/html parts and decide whether to decode the part or
not. Perhaps a simple heuristic could be to search the content of the
part for the strings "charset=" and "encoding=", and if either is found, notmuch
wouldn't decode that part. Otherwise it would decode it according to the
Content-Type header.
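
The heuristic above could be sketched roughly as follows. This is Python for illustration only, not a proposed notmuch patch; the function name should_decode_html_part is hypothetical, and matching case-insensitively is an assumption added here:

```python
import re

# Matches the two marker strings named in the heuristic. Case-insensitive
# matching is an assumption of this sketch, not something notmuch does.
_DECLARES_CHARSET = re.compile(rb"charset=|encoding=", re.IGNORECASE)


def should_decode_html_part(raw_body: bytes) -> bool:
    """Return True if the part should be decoded per its Content-Type header.

    If the HTML itself declares a charset or encoding, leave the bytes
    untouched so that an external viewer can decode them on its own.
    """
    return _DECLARES_CHARSET.search(raw_body) is None


# A bare HTML fragment: no declaration inside, so decode it.
print(should_decode_html_part(b"<p>Ahoj</p>"))                      # True

# A complete document with a <meta> declaration: pass it through as is.
print(should_decode_html_part(
    b'<meta http-equiv="Content-Type" '
    b'content="text/html; charset=iso-8859-2" />'))                 # False
```

Note that this only decides who does the decoding. It does nothing about parts whose own declarations disagree with each other.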

As a curiosity, I found the following in one of my emails. Note that two
different encodings (iso-8859-2 and windows-1250) are specified at the
same time :) That's the reason why I think that fixing the problem won't
be trivial.

Content-Type: text/html; charset="iso-8859-2"
Content-Transfer-Encoding: 8bit

<?xml version="1.0" encoding="windows-1250" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2" />

Cheers,
-Michal


* Re: Double decoded text/html parts (was: [PATCH] test: Add test for searching of uncommonly encoded messages)
  2012-02-26  9:33               ` Double decoded text/html parts (was: [PATCH] test: Add test for searching of uncommonly encoded messages) Michal Sojka
@ 2012-02-26 10:20                 ` Serge Z
  0 siblings, 0 replies; 15+ messages in thread
From: Serge Z @ 2012-02-26 10:20 UTC (permalink / raw)
  To: notmuch


This works:
w3m -o document_charset=windows-1251 test.html

This tells w3m to assume the windows-1251 encoding when no html-meta
content-type tag is given.


* Re: [PATCH] test: Add test for searching of uncommonly encoded messages
  2012-02-24  0:33   ` [PATCH] test: Add test for searching of uncommonly encoded messages Michal Sojka
  2012-02-24  4:29     ` Serge Z
  2012-02-24  7:36     ` [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them Michal Sojka
@ 2012-02-29 11:55     ` David Bremner
  2 siblings, 0 replies; 15+ messages in thread
From: David Bremner @ 2012-02-29 11:55 UTC (permalink / raw)
  To: Michal Sojka, notmuch

On Fri, 24 Feb 2012 01:33:15 +0100, Michal Sojka <sojkam1@fel.cvut.cz> wrote:
> Emails that are encoded differently than as ASCII or UTF-8 are not
> indexed properly by notmuch. It is not possible to search for non-ASCII
> words within those messages.

pushed

d


* Re: [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them
  2012-02-24  7:36     ` [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them Michal Sojka
  2012-02-24  7:36       ` [PATCH 2/2] test: Remove 'broken' flag from encoding test Michal Sojka
  2012-02-25  4:33       ` [PATCH 1/2] Convert non-UTF-8 parts to UTF-8 before indexing them Austin Clements
@ 2012-02-29 11:55       ` David Bremner
  2 siblings, 0 replies; 15+ messages in thread
From: David Bremner @ 2012-02-29 11:55 UTC (permalink / raw)
  To: Michal Sojka, notmuch

On Fri, 24 Feb 2012 08:36:22 +0100, Michal Sojka <sojkam1@fel.cvut.cz> wrote:
> This fixes a bug that prevented searching for non-ASCII words in such
> parts. The code here was copied from show_text_part_content(), because
> the show command already does the needed conversion when showing the
> message.

pushed both,

d

