* [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init()
@ 2013-09-10 18:51 Jani Nikula
2013-09-10 19:31 ` Daniel Kahn Gillmor
2013-09-11 2:02 ` [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Austin Clements
0 siblings, 2 replies; 13+ messages in thread
From: Jani Nikula @ 2013-09-10 18:51 UTC (permalink / raw)
To: notmuch
As explained by Jeffrey Stedfast, the author of GMime, quoted in [1]:
> Passing the GMIME_ENABLE_RFC2047_WORKAROUNDS flag to g_mime_init()
> *should* solve the decoding problem mentioned in the thread. This
> flag should be safe to pass into g_mime_init() without any bad side
> effects and my unit tests do test that code-path.
The thread being referred to is [2].
[1] id:87bo56viyo.fsf@nikula.org
[2] id:08cb1dcd-c5db-4e33-8b09-7730cb3d59a2@gmail.com
---
lib/database.cc | 2 +-
lib/index.cc | 2 +-
lib/message-file.c | 2 +-
notmuch.c | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/lib/database.cc b/lib/database.cc
index 5cc0765..bb4f180 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -655,7 +655,7 @@ notmuch_database_open (const char *path,
/* Initialize gmime */
if (! initialized) {
- g_mime_init (0);
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
initialized = 1;
}
diff --git a/lib/index.cc b/lib/index.cc
index a2edd6d..78c18cf 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -440,7 +440,7 @@ _notmuch_message_index_file (notmuch_message_t *message,
static bool mbox_warning = false;
if (! initialized) {
- g_mime_init (0);
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
initialized = 1;
}
diff --git a/lib/message-file.c b/lib/message-file.c
index 4d9af89..a2850c2 100644
--- a/lib/message-file.c
+++ b/lib/message-file.c
@@ -228,7 +228,7 @@ notmuch_message_file_get_header (notmuch_message_file_t *message,
is_received = (strcmp(header_desired,"received") == 0);
if (! initialized) {
- g_mime_init (0);
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
initialized = 1;
}
diff --git a/notmuch.c b/notmuch.c
index 78d29a8..7300c21 100644
--- a/notmuch.c
+++ b/notmuch.c
@@ -264,7 +264,7 @@ main (int argc, char *argv[])
local = talloc_new (NULL);
- g_mime_init (0);
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
#if !GLIB_CHECK_VERSION(2, 35, 1)
g_type_init ();
#endif
--
1.8.4.rc3
* Re: [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init()
2013-09-10 18:51 [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Jani Nikula
@ 2013-09-10 19:31 ` Daniel Kahn Gillmor
2013-09-10 22:35 ` Austin Clements
2013-09-11 2:02 ` [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Austin Clements
1 sibling, 1 reply; 13+ messages in thread
From: Daniel Kahn Gillmor @ 2013-09-10 19:31 UTC (permalink / raw)
To: notmuch
On 09/10/2013 02:51 PM, Jani Nikula wrote:
> As explained by Jeffrey Stedfast, the author of GMime, quoted in [1]:
>
>> Passing the GMIME_ENABLE_RFC2047_WORKAROUNDS flag to g_mime_init()
>> *should* solve the decoding problem mentioned in the thread. This
>> flag should be safe to pass into g_mime_init() without any bad side
>> effects and my unit tests do test that code-path.
the result of doing this is that some legitimately-crafted subject
lines will become unrepresentable.
I'm always leery of trying to improve support for data that doesn't
follow the standards at the expense of data that *does* follow the
standards.
--dkg
* Re: [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init()
2013-09-10 19:31 ` Daniel Kahn Gillmor
@ 2013-09-10 22:35 ` Austin Clements
2013-09-10 22:50 ` [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() a test Daniel Kahn Gillmor
0 siblings, 1 reply; 13+ messages in thread
From: Austin Clements @ 2013-09-10 22:35 UTC (permalink / raw)
To: Daniel Kahn Gillmor; +Cc: notmuch
Quoth Daniel Kahn Gillmor on Sep 10 at 3:31 pm:
> On 09/10/2013 02:51 PM, Jani Nikula wrote:
> > As explained by Jeffrey Stedfast, the author of GMime, quoted in [1]:
> >
> >> Passing the GMIME_ENABLE_RFC2047_WORKAROUNDS flag to g_mime_init()
> >> *should* solve the decoding problem mentioned in the thread. This
> >> flag should be safe to pass into g_mime_init() without any bad side
> >> effects and my unit tests do test that code-path.
>
> the result of doing this is that some legitimately-crafted subject
> lines will become unrepresentable.
>
> I'm always leery of trying to improve support for data that doesn't
> follow the standards at the expense of data that *does* follow the
> standards.
>
> --dkg
I haven't looked at exactly what workarounds this enables, but if it's
what I'm guessing (RFC 2047 escapes in the middle of RFC 2822 text
tokens), are there really subject lines that this will misinterpret
that weren't obviously crafted to break the workaround? The RFC 2047
escape sequence was deliberately designed to be obscure, since RFC
2047 itself caused previously "standards-compliant" subject lines to
potentially be interpreted differently.
* Re: [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() a test
2013-09-10 22:35 ` Austin Clements
@ 2013-09-10 22:50 ` Daniel Kahn Gillmor
2013-09-11 1:51 ` Austin Clements
2013-09-11 18:21 ` Jani Nikula
0 siblings, 2 replies; 13+ messages in thread
From: Daniel Kahn Gillmor @ 2013-09-10 22:50 UTC (permalink / raw)
To: Austin Clements; +Cc: notmuch
On 09/10/2013 06:35 PM, Austin Clements wrote:
> I haven't looked at exactly what workarounds this enables, but if it's
> what I'm guessing (RFC 2047 escapes in the middle of RFC 2822 text
> tokens), are there really subject lines that this will misinterpret
> that weren't obviously crafted to break the workaround?
not to get all meta, but i imagine subject lines that refer to an example
of this particular issue (e.g. when talking about RFC 2047) will break
;) I'm trying one variant here.
> The RFC 2047
> escape sequence was deliberately designed to be obscure, since RFC
> 2047 itself caused previously "standards-compliant" subject lines to
> potentially be interpreted differently.
right, and it was designed explicitly to put the boundary markers at word
boundaries, and not in the middle of a word (i think that's what this is
all about, right?). so implementations which put the boundary markers
in the middle of a word, or which include whitespace within the encoded
text, aren't speaking RFC 2047.
anyway, if there's a rough consensus to go forward with this, i'm not
about to block it. I understand that a large part of the business of
being an MUA is working around other people's bugs instead of expecting
them to fix them :/ I just don't like mis-rendering other text.
--dkg
* Re: [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() a test
2013-09-10 22:50 ` [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() a test Daniel Kahn Gillmor
@ 2013-09-11 1:51 ` Austin Clements
2013-09-11 18:21 ` Jani Nikula
1 sibling, 0 replies; 13+ messages in thread
From: Austin Clements @ 2013-09-11 1:51 UTC (permalink / raw)
To: Daniel Kahn Gillmor; +Cc: notmuch
On Tue, 10 Sep 2013, Daniel Kahn Gillmor <dkg@fifthhorseman.net> wrote:
> On 09/10/2013 06:35 PM, Austin Clements wrote:
>
>> I haven't looked at exactly what workarounds this enables, but if it's
>> what I'm guessing (RFC 2047 escapes in the middle of RFC 2822 text
>> tokens), are there really subject lines that this will misinterpret
>> that weren't obviously crafted to break the workaround?
>
> not to get all meta, but i imagine subject lines that refer to an example
> of this particular issue (e.g. when talking about RFC 2047) will break
> ;) I'm trying one variant here.
That's cheating. ]:--8) Though, I wonder, you mentioned in your
original email that there would be subject lines that are
*unrepresentable* given the worked-around RFC 2047. Did you mean that?
If so, can you provide an example? Isn't it always possible to, say,
RFC 2047 escape the whole subject, which would be decoded correctly
whether the decoder strictly adheres to RFC 2047 or uses the
workarounds?
(Speaking of which, it looks like message-mode does *not* RFC 2047
encode the subject if it contains text that could be mistaken for an
encoded-word, so such subjects won't get round-tripped correctly.)
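
The "escape the whole subject" idea can be sketched in shell roughly as follows. This is only an illustration under stated assumptions: q_encode_all is a hypothetical helper, the subject is assumed to be UTF-8 already, every byte is hex-escaped (which the Q encoding permits, if wastefully), and the 75-character limit on a single encoded-word is ignored, so long subjects would still need to be split.

```shell
# Hypothetical helper: wrap the whole of "$1" in one Q-encoded word.
# Every byte becomes =XX, so the result contains no literal "=?" or "?="
# inside the encoded text and no whitespace either.
# Assumptions: input is UTF-8; od, tr and sed are available; the
# 75-character encoded-word limit is not enforced.
q_encode_all() {
    # Dump the bytes as hex digits with all separators removed, e.g. 613D3F62.
    hex=$(printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n' | tr 'a-f' 'A-F')
    # Prefix each pair of hex digits with "=" and add the encoded-word markers.
    printf '=?utf-8?q?%s?=\n' "$(printf '%s\n' "$hex" | sed 's/../=&/g')"
}

q_encode_all 'a=?b'
```

Because the output is a single whitespace-free token with properly placed markers, it should decode the same way whether or not the decoder applies the workarounds.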
* Re: [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init()
2013-09-10 18:51 [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Jani Nikula
2013-09-10 19:31 ` Daniel Kahn Gillmor
@ 2013-09-11 2:02 ` Austin Clements
2013-09-11 17:36 ` [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings Jani Nikula
2013-09-11 17:40 ` [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Jani Nikula
1 sibling, 2 replies; 13+ messages in thread
From: Austin Clements @ 2013-09-11 2:02 UTC (permalink / raw)
To: Jani Nikula, notmuch
LGTM in principle, though I'd like to see a test of some of the
malformed RFC 2047 that this lets us decode. Is there a summary
somewhere of exactly what these workarounds enable?
This isn't directly related to this patch, but is there a reason we
g_mime_init in so many different places? Both the CLI and
notmuch_database_open I can understand because the CLI also uses GMime
and should be sure it's initialized. Maaaybe
notmuch_message_file_get_header because notmuch_message_file is
theoretically independent of the database, even though I don't think
it's possible to call into it without first calling
notmuch_database_open. But _notmuch_message_index_file?
On Tue, 10 Sep 2013, Jani Nikula <jani@nikula.org> wrote:
> As explained by Jeffrey Stedfast, the author of GMime, quoted in [1]:
>
>> Passing the GMIME_ENABLE_RFC2047_WORKAROUNDS flag to g_mime_init()
>> *should* solve the decoding problem mentioned in the thread. This
>> flag should be safe to pass into g_mime_init() without any bad side
>> effects and my unit tests do test that code-path.
>
> The thread being referred to is [2].
>
> [1] id:87bo56viyo.fsf@nikula.org
> [2] id:08cb1dcd-c5db-4e33-8b09-7730cb3d59a2@gmail.com
> ---
> lib/database.cc | 2 +-
> lib/index.cc | 2 +-
> lib/message-file.c | 2 +-
> notmuch.c | 2 +-
> 4 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/lib/database.cc b/lib/database.cc
> index 5cc0765..bb4f180 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -655,7 +655,7 @@ notmuch_database_open (const char *path,
>
> /* Initialize gmime */
> if (! initialized) {
> - g_mime_init (0);
> + g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
> initialized = 1;
> }
>
> diff --git a/lib/index.cc b/lib/index.cc
> index a2edd6d..78c18cf 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -440,7 +440,7 @@ _notmuch_message_index_file (notmuch_message_t *message,
> static bool mbox_warning = false;
>
> if (! initialized) {
> - g_mime_init (0);
> + g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
> initialized = 1;
> }
>
> diff --git a/lib/message-file.c b/lib/message-file.c
> index 4d9af89..a2850c2 100644
> --- a/lib/message-file.c
> +++ b/lib/message-file.c
> @@ -228,7 +228,7 @@ notmuch_message_file_get_header (notmuch_message_file_t *message,
> is_received = (strcmp(header_desired,"received") == 0);
>
> if (! initialized) {
> - g_mime_init (0);
> + g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
> initialized = 1;
> }
>
> diff --git a/notmuch.c b/notmuch.c
> index 78d29a8..7300c21 100644
> --- a/notmuch.c
> +++ b/notmuch.c
> @@ -264,7 +264,7 @@ main (int argc, char *argv[])
>
> local = talloc_new (NULL);
>
> - g_mime_init (0);
> + g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
> #if !GLIB_CHECK_VERSION(2, 35, 1)
> g_type_init ();
> #endif
> --
> 1.8.4.rc3
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
* [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings
2013-09-11 2:02 ` [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Austin Clements
@ 2013-09-11 17:36 ` Jani Nikula
2013-09-11 17:36 ` [PATCH v2 2/2] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Jani Nikula
` (3 more replies)
2013-09-11 17:40 ` [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Jani Nikula
1 sibling, 4 replies; 13+ messages in thread
From: Jani Nikula @ 2013-09-11 17:36 UTC (permalink / raw)
To: notmuch; +Cc: Daniel Kahn Gillmor
Some common broken RFC 2047 encodings that we currently let gmime
parse strictly. We could tell gmime to be forgiving in what it accepts
as RFC 2047 encoding, making these tests pass.
---
test/encoding | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/test/encoding b/test/encoding
index 2e1326e..7372b6b 100755
--- a/test/encoding
+++ b/test/encoding
@@ -29,4 +29,22 @@ add_message '[content-type]="text/plain; charset=iso-8859-2"' \
output=$(notmuch search tučňáččí 2>&1 | notmuch_show_sanitize)
test_expect_equal "$output" "thread:0000000000000002 2001-01-05 [1/1] Notmuch Test Suite; ISO-8859-2 encoded message (inbox unread)"
+test_begin_subtest "RFC 2047 encoded word with spaces"
+test_subtest_known_broken
+add_message '[subject]="=?utf-8?q?encoded word with spaces?="'
+output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
+test_expect_equal "$output" "thread:0000000000000003 2001-01-05 [1/1] Notmuch Test Suite; encoded word with spaces (inbox unread)"
+
+test_begin_subtest "RFC 2047 encoded words back to back"
+test_subtest_known_broken
+add_message '[subject]="=?utf-8?q?encoded-words-back?==?utf-8?q?to-back?="'
+output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
+test_expect_equal "$output" "thread:0000000000000004 2001-01-05 [1/1] Notmuch Test Suite; encoded-words-backto-back (inbox unread)"
+
+test_begin_subtest "RFC 2047 encoded words without space before or after"
+test_subtest_known_broken
+add_message '[subject]="=?utf-8?q?encoded?=word without=?utf-8?q?space?=" '
+output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
+test_expect_equal "$output" "thread:0000000000000005 2001-01-05 [1/1] Notmuch Test Suite; encodedword withoutspace (inbox unread)"
+
test_done
--
1.8.4.rc3
* [PATCH v2 2/2] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init()
2013-09-11 17:36 ` [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings Jani Nikula
@ 2013-09-11 17:36 ` Jani Nikula
2013-09-11 18:37 ` [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings Austin Clements
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: Jani Nikula @ 2013-09-11 17:36 UTC (permalink / raw)
To: notmuch; +Cc: Daniel Kahn Gillmor
As explained by Jeffrey Stedfast, the author of GMime, quoted in [1]:
> Passing the GMIME_ENABLE_RFC2047_WORKAROUNDS flag to g_mime_init()
> *should* solve the decoding problem mentioned in the thread. This
> flag should be safe to pass into g_mime_init() without any bad side
> effects and my unit tests do test that code-path.
The thread being referred to is [2].
[1] id:87bo56viyo.fsf@nikula.org
[2] id:08cb1dcd-c5db-4e33-8b09-7730cb3d59a2@gmail.com
---
lib/database.cc | 2 +-
lib/index.cc | 2 +-
lib/message-file.c | 2 +-
notmuch.c | 2 +-
test/encoding | 3 ---
5 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/lib/database.cc b/lib/database.cc
index 5cc0765..bb4f180 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -655,7 +655,7 @@ notmuch_database_open (const char *path,
/* Initialize gmime */
if (! initialized) {
- g_mime_init (0);
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
initialized = 1;
}
diff --git a/lib/index.cc b/lib/index.cc
index a2edd6d..78c18cf 100644
--- a/lib/index.cc
+++ b/lib/index.cc
@@ -440,7 +440,7 @@ _notmuch_message_index_file (notmuch_message_t *message,
static bool mbox_warning = false;
if (! initialized) {
- g_mime_init (0);
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
initialized = 1;
}
diff --git a/lib/message-file.c b/lib/message-file.c
index 4d9af89..a2850c2 100644
--- a/lib/message-file.c
+++ b/lib/message-file.c
@@ -228,7 +228,7 @@ notmuch_message_file_get_header (notmuch_message_file_t *message,
is_received = (strcmp(header_desired,"received") == 0);
if (! initialized) {
- g_mime_init (0);
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
initialized = 1;
}
diff --git a/notmuch.c b/notmuch.c
index 78d29a8..7300c21 100644
--- a/notmuch.c
+++ b/notmuch.c
@@ -264,7 +264,7 @@ main (int argc, char *argv[])
local = talloc_new (NULL);
- g_mime_init (0);
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
#if !GLIB_CHECK_VERSION(2, 35, 1)
g_type_init ();
#endif
diff --git a/test/encoding b/test/encoding
index 7372b6b..8609652 100755
--- a/test/encoding
+++ b/test/encoding
@@ -30,19 +30,16 @@ output=$(notmuch search tučňáččí 2>&1 | notmuch_show_sanitize)
test_expect_equal "$output" "thread:0000000000000002 2001-01-05 [1/1] Notmuch Test Suite; ISO-8859-2 encoded message (inbox unread)"
test_begin_subtest "RFC 2047 encoded word with spaces"
-test_subtest_known_broken
add_message '[subject]="=?utf-8?q?encoded word with spaces?="'
output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
test_expect_equal "$output" "thread:0000000000000003 2001-01-05 [1/1] Notmuch Test Suite; encoded word with spaces (inbox unread)"
test_begin_subtest "RFC 2047 encoded words back to back"
-test_subtest_known_broken
add_message '[subject]="=?utf-8?q?encoded-words-back?==?utf-8?q?to-back?="'
output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
test_expect_equal "$output" "thread:0000000000000004 2001-01-05 [1/1] Notmuch Test Suite; encoded-words-backto-back (inbox unread)"
test_begin_subtest "RFC 2047 encoded words without space before or after"
-test_subtest_known_broken
add_message '[subject]="=?utf-8?q?encoded?=word without=?utf-8?q?space?=" '
output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
test_expect_equal "$output" "thread:0000000000000005 2001-01-05 [1/1] Notmuch Test Suite; encodedword withoutspace (inbox unread)"
--
1.8.4.rc3
* Re: [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init()
2013-09-11 2:02 ` [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Austin Clements
2013-09-11 17:36 ` [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings Jani Nikula
@ 2013-09-11 17:40 ` Jani Nikula
1 sibling, 0 replies; 13+ messages in thread
From: Jani Nikula @ 2013-09-11 17:40 UTC (permalink / raw)
To: Austin Clements, notmuch
On Wed, 11 Sep 2013, Austin Clements <amdragon@MIT.EDU> wrote:
> LGTM in principle, though I'd like to see a test of some of the
> malformed RFC 2047 that this lets us decode. Is there a summary
> somewhere of exactly what these workarounds enable?
Not that I know of; looking into gmime source it's mostly about encoded
words without surrounding space, or space within encoded words.
v2 now has known broken tests for known broken encodings...
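
As a rough illustration of what "forgiving" could mean here, the sketch below searches for the =?charset?encoding?text?= shape anywhere in a header value, regardless of surrounding whitespace and allowing spaces inside the text. This is not GMime's actual algorithm; find_lenient_encoded_words is a hypothetical helper, the charset pattern is simplified, and grep -o is an extension found in GNU and BSD grep rather than POSIX.

```shell
# Hypothetical helper (not GMime's algorithm): print every substring of "$1"
# that looks like an encoded-word, one per line, without requiring
# whitespace around it and while allowing spaces inside the encoded text.
# Assumption: grep supports -o (GNU and BSD grep do).
find_lenient_encoded_words() {
    printf '%s\n' "$1" |
        grep -Eo '=\?[A-Za-z0-9_.*-]+\?[BbQq]\?[^?]*\?='
}

# The three kinds of broken subject discussed in this thread.
find_lenient_encoded_words '=?utf-8?q?encoded word with spaces?='
find_lenient_encoded_words '=?utf-8?q?encoded-words-back?==?utf-8?q?to-back?='
find_lenient_encoded_words '=?utf-8?q?encoded?=word without=?utf-8?q?space?='
```

Each call prints the pieces a lenient scan would pick out: one for the first subject and two each for the second and third.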
> This isn't directly related to this patch, but is there a reason we
> g_mime_init in so many different places? Both the CLI and
> notmuch_database_open I can understand because the CLI also uses GMime
> and should be sure it's initialized. Maaaybe
> notmuch_message_file_get_header because notmuch_message_file is
> theoretically independent of the database, even though I don't think
> it's possible to call into it without first calling
> notmuch_database_open. But _notmuch_message_index_file?
I noted the same, but decided it's another patch, another time.
BR,
Jani.
* Re: [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() a test
2013-09-10 22:50 ` [PATCH] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() a test Daniel Kahn Gillmor
2013-09-11 1:51 ` Austin Clements
@ 2013-09-11 18:21 ` Jani Nikula
1 sibling, 0 replies; 13+ messages in thread
From: Jani Nikula @ 2013-09-11 18:21 UTC (permalink / raw)
To: Daniel Kahn Gillmor, Austin Clements; +Cc: notmuch
On Wed, 11 Sep 2013, Daniel Kahn Gillmor <dkg@fifthhorseman.net> wrote:
> On 09/10/2013 06:35 PM, Austin Clements wrote:
>
>> I haven't looked at exactly what workarounds this enables, but if it's
>> what I'm guessing (RFC 2047 escapes in the middle of RFC 2822 text
>> tokens), are there really subject lines that this will misinterpret
>> that weren't obviously crafted to break the workaround?
>
> not to get all meta, but i imagine subject lines that refer to an example
> of this particular issue (e.g. when talking about RFC 2047) will break
> ;) I'm trying one variant here.
The meta reply here, running the patch. The broken RFC 2047 got
liberally accepted. :)
>> The RFC 2047
>> escape sequence was deliberately designed to be obscure, since RFC
>> 2047 itself caused previously "standards-compliant" subject lines to
>> potentially be interpreted differently.
>
> right, and it was designed explicitly to put the boundary markers at word
> boundaries, and not in the middle of a word (i think that's what this is
> all about, right?). so implementations which put the boundary markers
> in the middle of a word, or which include whitespace within the encoded
> text, aren't speaking RFC 2047.
>
> anyway, if there's a rough consensus to go forward with this, i'm not
> about to block it. I understand that a large part of the business of
> being an MUA is working around other people's bugs instead of expecting
> them to fix them :/ I just don't like mis-rendering other text.
I share your concern. Yet the amount of email with unintentionally
broken encoding is much greater than the amount of email that has
intentional character sequences that resemble broken encodings. Which is
why I'm willing to sacrifice the latter to improve the user experience
for the majority of users. YMMV.
BR,
Jani.
* Re: [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings
2013-09-11 17:36 ` [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings Jani Nikula
2013-09-11 17:36 ` [PATCH v2 2/2] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Jani Nikula
@ 2013-09-11 18:37 ` Austin Clements
2013-09-11 19:57 ` Tomi Ollila
2013-09-14 17:21 ` David Bremner
3 siblings, 0 replies; 13+ messages in thread
From: Austin Clements @ 2013-09-11 18:37 UTC (permalink / raw)
To: Jani Nikula; +Cc: notmuch, Daniel Kahn Gillmor
v2 LGTM.
Quoth Jani Nikula on Sep 11 at 8:36 pm:
> Some common broken RFC 2047 encodings that we currently let gmime
> parse strictly. We could tell gmime to be forgiving in what it accepts
> as RFC 2047 encoding, making these tests pass.
> ---
> test/encoding | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/test/encoding b/test/encoding
> index 2e1326e..7372b6b 100755
> --- a/test/encoding
> +++ b/test/encoding
> @@ -29,4 +29,22 @@ add_message '[content-type]="text/plain; charset=iso-8859-2"' \
> output=$(notmuch search tučňáččí 2>&1 | notmuch_show_sanitize)
> test_expect_equal "$output" "thread:0000000000000002 2001-01-05 [1/1] Notmuch Test Suite; ISO-8859-2 encoded message (inbox unread)"
>
> +test_begin_subtest "RFC 2047 encoded word with spaces"
> +test_subtest_known_broken
> +add_message '[subject]="=?utf-8?q?encoded word with spaces?="'
> +output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
> +test_expect_equal "$output" "thread:0000000000000003 2001-01-05 [1/1] Notmuch Test Suite; encoded word with spaces (inbox unread)"
> +
> +test_begin_subtest "RFC 2047 encoded words back to back"
> +test_subtest_known_broken
> +add_message '[subject]="=?utf-8?q?encoded-words-back?==?utf-8?q?to-back?="'
> +output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
> +test_expect_equal "$output" "thread:0000000000000004 2001-01-05 [1/1] Notmuch Test Suite; encoded-words-backto-back (inbox unread)"
> +
> +test_begin_subtest "RFC 2047 encoded words without space before or after"
> +test_subtest_known_broken
> +add_message '[subject]="=?utf-8?q?encoded?=word without=?utf-8?q?space?=" '
> +output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
> +test_expect_equal "$output" "thread:0000000000000005 2001-01-05 [1/1] Notmuch Test Suite; encodedword withoutspace (inbox unread)"
> +
> test_done
* Re: [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings
2013-09-11 17:36 ` [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings Jani Nikula
2013-09-11 17:36 ` [PATCH v2 2/2] lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init() Jani Nikula
2013-09-11 18:37 ` [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings Austin Clements
@ 2013-09-11 19:57 ` Tomi Ollila
2013-09-14 17:21 ` David Bremner
3 siblings, 0 replies; 13+ messages in thread
From: Tomi Ollila @ 2013-09-11 19:57 UTC (permalink / raw)
To: Jani Nikula, notmuch
On Wed, Sep 11 2013, Jani Nikula <jani@nikula.org> wrote:
> Some common broken RFC 2047 encodings that we currently let gmime
> parse strictly. We could tell gmime to be forgiving in what it accepts
> as RFC 2047 encoding, making these tests pass.
> ---
V2 LGTM.
Tomi
> test/encoding | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/test/encoding b/test/encoding
> index 2e1326e..7372b6b 100755
> --- a/test/encoding
> +++ b/test/encoding
> @@ -29,4 +29,22 @@ add_message '[content-type]="text/plain; charset=iso-8859-2"' \
> output=$(notmuch search tučňáččí 2>&1 | notmuch_show_sanitize)
> test_expect_equal "$output" "thread:0000000000000002 2001-01-05 [1/1] Notmuch Test Suite; ISO-8859-2 encoded message (inbox unread)"
>
> +test_begin_subtest "RFC 2047 encoded word with spaces"
> +test_subtest_known_broken
> +add_message '[subject]="=?utf-8?q?encoded word with spaces?="'
> +output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
> +test_expect_equal "$output" "thread:0000000000000003 2001-01-05 [1/1] Notmuch Test Suite; encoded word with spaces (inbox unread)"
> +
> +test_begin_subtest "RFC 2047 encoded words back to back"
> +test_subtest_known_broken
> +add_message '[subject]="=?utf-8?q?encoded-words-back?==?utf-8?q?to-back?="'
> +output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
> +test_expect_equal "$output" "thread:0000000000000004 2001-01-05 [1/1] Notmuch Test Suite; encoded-words-backto-back (inbox unread)"
> +
> +test_begin_subtest "RFC 2047 encoded words without space before or after"
> +test_subtest_known_broken
> +add_message '[subject]="=?utf-8?q?encoded?=word without=?utf-8?q?space?=" '
> +output=$(notmuch search id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
> +test_expect_equal "$output" "thread:0000000000000005 2001-01-05 [1/1] Notmuch Test Suite; encodedword withoutspace (inbox unread)"
> +
> test_done
> --
> 1.8.4.rc3
>
* Re: [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings
2013-09-11 17:36 ` [PATCH v2 1/2] test: add known broken tests for known broken RFC 2047 encodings Jani Nikula
` (2 preceding siblings ...)
2013-09-11 19:57 ` Tomi Ollila
@ 2013-09-14 17:21 ` David Bremner
3 siblings, 0 replies; 13+ messages in thread
From: David Bremner @ 2013-09-14 17:21 UTC (permalink / raw)
To: Jani Nikula, notmuch; +Cc: Daniel Kahn Gillmor
Jani Nikula <jani@nikula.org> writes:
> Some common broken RFC 2047 encodings that we currently let gmime
> parse strictly. We could tell gmime to be forgiving in what it accepts
> as RFC 2047 encoding, making these tests pass.
Pushed this version.
d