* [PATCH] Remove/replace vertical whitespace in subject header field body.
From: James Vasile @ 2011-03-17 1:44 UTC
To: notmuch
[-- Attachment #1: Type: text/plain, Size: 3642 bytes --]
RFC 822 specifies that headers are one-liners of ASCII:
> The field-body may be composed of any ASCII characters, except CR or
> LF. (While CR and/or LF may be present in the actual text, they are
> removed by the action of unfolding the field.)
RFC 5335 allows UTF-8 in header field bodies, but as I read the docs,
the RFC 822 specification that they end up as one-liners still applies.
RFC 5322 describes folding and unfolding as follows:
> Each header field is logically a single line of characters comprising
> the field name, the colon, and the field body. For convenience
> however, and to deal with the 998/78 character limitations per line,
> the field body portion of a header field can be split into a
> multiple-line representation; this is called "folding". The general
> rule is that wherever this specification allows for folding white
> space (not simply WSP characters), a CRLF may be inserted before any
> WSP.
...
> The process of moving from this folded multiple-line representation of
> a header field to its single line representation is called
> "unfolding". Unfolding is accomplished by simply removing any CRLF
> that is immediately followed by WSP.
Again, unfolded subjects should be one-liners.
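As a rough illustration of the unfolding rule quoted above, here is a
minimal sketch in C. The name unfold_header is hypothetical and does not
come from notmuch; the sketch assumes a NUL-terminated buffer that may be
modified in place.

```c
#include <stddef.h>

/* Unfold a header field in place: drop every CRLF that is immediately
 * followed by WSP (space or horizontal tab), as RFC 5322 describes.
 * The WSP itself is kept. Returns its argument.
 * Hypothetical helper, for illustration only. */
static char *
unfold_header (char *s)
{
    char *src = s;
    char *dst = s;

    while (*src) {
        if (src[0] == '\r' && src[1] == '\n' &&
            (src[2] == ' ' || src[2] == '\t')) {
            src += 2;           /* skip the CRLF, keep the following WSP */
            continue;
        }
        *dst++ = *src++;
    }
    *dst = '\0';
    return s;
}
```

A CRLF that is not followed by WSP is left alone here, since it is not
part of a fold.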
An email sent to me from pingg.com (I think it's a pretentious
version of evite) came with a subject of
"=?utf-8?Q?bring_small_items_for_a_pi=C3=B1ata=21=21=21=21=0A?=", which
"notmuch search" displays as "Subject: bring small items for a
piñata!!!!" with a \n at the end. This befuddles the emacs UI ("Error:
Unexpected output from notmuch search:"). I've attached an email that
reproduces the error.
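To show where the trailing \n comes from, here is a minimal sketch of
decoding the payload of an RFC 2047 "Q" encoded-word (the text between
"?Q?" and the closing "?="). The names q_decode and hex_value are
hypothetical, this is not how notmuch or GMime decode headers, and
charset conversion is ignored entirely.

```c
#include <stddef.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_value (char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode a "Q" payload into out, which must have room for
 * strlen(in) + 1 bytes. '_' becomes a space and "=XX" becomes the
 * byte 0xXX; anything else is copied through unchanged.
 * Returns the number of bytes written, not counting the terminator.
 * Hypothetical helper, for illustration only. */
static size_t
q_decode (const char *in, char *out)
{
    size_t n = 0;

    while (*in) {
        if (*in == '_') {
            out[n++] = ' ';
            in++;
        } else if (*in == '=' &&
                   hex_value (in[1]) >= 0 && hex_value (in[2]) >= 0) {
            out[n++] = (char) (hex_value (in[1]) * 16 + hex_value (in[2]));
            in += 3;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

Fed the payload from the subject above, the final "=0A" decodes to a
single LF byte, so the decoded subject ends in a newline.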
I don't think ending the subject with a utf-8-encoded 0x0A followed by
the usual CRLF is RFC-compliant. Still, notmuch should surely follow
the deplorable "accept liberally/emit conservatively" doctrine.
Here is a patch that trims leading and trailing whitespace from subjects
and replaces internal non-space, non-horizontal-tab whitespace with
spaces. It fixes the problem described in this message.
---
lib/thread.cc | 36 ++++++++++++++++++++++++++++++++----
1 files changed, 32 insertions(+), 4 deletions(-)
diff --git a/lib/thread.cc b/lib/thread.cc
index 5190a66..7a816ea 100644
--- a/lib/thread.cc
+++ b/lib/thread.cc
@@ -266,6 +266,34 @@ _thread_add_message (notmuch_thread_t *thread,
}
}
+/* Remove leading/trailing whitespace and replace internal vertical
+ * whitespace with spaces.
+ */
+static char *
+rectify_whitespace (char *str)
+{
+ char *last;
+ char *curr;
+
+ while (isspace (*str))
+ str++;
+
+ if (*str == 0)
+ return str;
+
+ last = str + strlen(str) - 1;
+ while (last > str && isspace (*last))
+ last--;
+
+ curr = str;
+ do
+ if ((*curr >= 10) && (*curr <= 13))
+ *curr = 32; //space
+ while (curr++ < last);
+
+ return str;
+}
+
static void
_thread_set_subject_from_message (notmuch_thread_t *thread,
notmuch_message_t *message)
@@ -282,11 +310,11 @@ _thread_set_subject_from_message (notmuch_thread_t *thread,
(strncasecmp (subject, "Vs: ", 4) == 0) ||
(strncasecmp (subject, "Sv: ", 4) == 0)) {
- cleaned_subject = talloc_strndup (thread,
- subject + 4,
- strlen(subject) - 4);
+ cleaned_subject = rectify_whitespace(talloc_strndup (thread,
+ subject + 4,
+ strlen(subject) - 4));
} else {
- cleaned_subject = talloc_strdup (thread, subject);
+ cleaned_subject = rectify_whitespace(talloc_strdup (thread, subject));
}
if (thread->subject)
--
1.7.2.3
[-- Attachment #2: malformed_subject --]
[-- Type: application/octet-stream, Size: 352 bytes --]
Date: Fri, 11 Mar 2011 18:40:00 +0000
From: "redacted" <host@invite.pingg.com>
To: redacted@example.com
Message-Id: <20110311183749.526771.31453.9841841@sender.pingg.com>
Subject: =?utf-8?Q?bring_small_items_for_a_pi=C3=B1ata=21=21=21=21=0A?=
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Ignore this.
* [PATCH] replace null terminator in string
From: James Vasile @ 2011-03-17 1:55 UTC
To: notmuch
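The prior patch finds the last non-whitespace character but never
shortens the string, so trailing whitespace survives. To make it work
for trailing whitespace, we also need this patch, which writes a null
terminator right after that character.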
---
lib/thread.cc | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/lib/thread.cc b/lib/thread.cc
index 7a816ea..54fde2b 100644
--- a/lib/thread.cc
+++ b/lib/thread.cc
@@ -291,6 +291,8 @@ rectify_whitespace (char *str)
*curr = 32; //space
while (curr++ < last);
+ *(last+1) = 0;
+
return str;
}
--
1.7.2.3
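
For reference, the two patches taken together amount to roughly the
following standalone sketch. It is not the code as committed anywhere.
It differs from the patches in casting to unsigned char before calling
isspace, an assumption made here because subjects can contain UTF-8
bytes and passing a negative char to isspace is undefined behaviour.

```c
#include <ctype.h>
#include <string.h>

/* Trim leading and trailing whitespace and replace internal vertical
 * whitespace (LF, VT, FF, CR) with spaces. Modifies str in place and
 * returns a pointer to the first non-whitespace character. */
static char *
rectify_whitespace (char *str)
{
    char *last;
    char *curr;

    /* Skip leading whitespace. */
    while (isspace ((unsigned char) *str))
        str++;

    if (*str == '\0')
        return str;

    /* Find the last non-whitespace character and end the string there. */
    last = str + strlen (str) - 1;
    while (last > str && isspace ((unsigned char) *last))
        last--;
    *(last + 1) = '\0';

    /* Replace bytes 10..13 (LF, VT, FF, CR) with a space. */
    for (curr = str; curr <= last; curr++)
        if (*curr >= '\n' && *curr <= '\r')
            *curr = ' ';

    return str;
}
```

Note that the returned pointer may lie past the start of the buffer when
leading whitespace was skipped, so a caller that later frees the buffer
needs to keep the original pointer.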