From: James Vasile <james@hackervisions.org>
To: notmuch <notmuch@notmuchmail.org>
Subject: [PATCH] Remove/replace vertical whitespace in subject header field body.
Date: Wed, 16 Mar 2011 21:44:28 -0400
Message-ID: <87ipvifrlv.fsf@softwarefreedom.org>
[-- Attachment #1: Type: text/plain, Size: 3642 bytes --]
RFC 822 specifies that headers are one-liners of ASCII:
> The field-body may be composed of any ASCII characters, except CR or
> LF. (While CR and/or LF may be present in the actual text, they are
> removed by the action of unfolding the field.)
RFC 5335 allows UTF-8 in header field bodies, but as I read the docs,
the RFC 822 specification that they end up as one-liners still applies.
RFC 5322 describes folding and unfolding as follows:
> Each header field is logically a single line of characters comprising
> the field name, the colon, and the field body. For convenience
> however, and to deal with the 998/78 character limitations per line,
> the field body portion of a header field can be split into a
> multiple-line representation; this is called "folding". The general
> rule is that wherever this specification allows for folding white
> space (not simply WSP characters), a CRLF may be inserted before any
> WSP.
...
> The process of moving from this folded multiple-line representation of
> a header field to its single line representation is called
> "unfolding". Unfolding is accomplished by simply removing any CRLF
> that is immediately followed by WSP.
Again, unfolded subjects should be one-liners.
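As an illustration only, the unfolding rule quoted above can be sketched in C. The function name unfold_header is hypothetical and does not come from notmuch or GMime; the sketch assumes the field is held in a writable, NUL-terminated buffer.

```c
/* Hypothetical helper, not part of notmuch: unfold a header field in
 * place by removing every CRLF that is immediately followed by WSP
 * (space or horizontal tab). The WSP itself is kept, as RFC 5322
 * describes. */
char *
unfold_header (char *field)
{
    char *src = field;
    char *dst = field;

    while (*src) {
        if (src[0] == '\r' && src[1] == '\n' &&
            (src[2] == ' ' || src[2] == '\t')) {
            src += 2;           /* drop the CRLF, keep the WSP */
            continue;
        }
        *dst++ = *src++;
    }
    *dst = '\0';

    return field;
}
```

A CRLF that is not followed by WSP (for example the one that ends the field) is left alone, so a line feed smuggled in by other means, such as inside an encoded-word, is not touched by unfolding at all.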
An email sent to me from pingg.com (I think it's a pretentious
version of evite) came with a subject of
"=?utf-8?Q?bring_small_items_for_a_pi=C3=B1ata=21=21=21=21=0A?=", which
"notmuch search" displays as "Subject: bring small items for a
piñata!!!!" with a \n at the end. This befuddles the emacs UI ("Error:
Unexpected output from notmuch search:"). I've attached an email that
reproduces the error.
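To see where the trailing \n comes from, here is a rough sketch of decoding the payload of a "Q"-encoded word (the text between "?Q?" and "?="). The names hex_value and decode_q_payload are hypothetical; notmuch relies on its MIME library for the real decoding, so this only illustrates the encoding rules ("_" means space, "=XX" means the byte with hex value XX).

```c
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
int
hex_value (int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: decode a Q-encoded payload from `in` into `out`.
 * `out` must have room for strlen (in) + 1 bytes. Returns the number of
 * bytes written, not counting the terminating NUL. */
size_t
decode_q_payload (const char *in, char *out)
{
    size_t n = 0;
    int hi, lo;

    while (*in) {
        if (*in == '_') {
            out[n++] = ' ';
            in++;
        } else if (*in == '=' &&
                   (hi = hex_value ((unsigned char) in[1])) >= 0 &&
                   (lo = hex_value ((unsigned char) in[2])) >= 0) {
            out[n++] = (char) (hi * 16 + lo);
            in += 3;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';

    return n;
}
```

Fed the payload "bring_small_items_for_a_pi=C3=B1ata=21=21=21=21=0A", the final "=0A" decodes to a literal line feed, which is the byte that ends up at the end of the displayed subject.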
I don't think ending the subject with an encoded 0x0A (the "=0A" inside
the encoded-word) followed by the usual CRLF is RFC-compliant. Still,
notmuch should surely follow the deplorable "accept liberally/emit
conservatively" doctrine.
Here is a patch that trims leading and trailing whitespace from subjects
and replaces internal vertical whitespace (LF, VT, FF and CR, i.e.
everything isspace() matches other than space and horizontal tab) with
spaces. It fixes the problem described in this message.
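For illustration, the behavior described above can be sketched as a standalone C function with no dependency on talloc or the notmuch sources. The name clean_subject is hypothetical. Unlike the patch as posted, this sketch writes a terminator after the last non-whitespace character and casts to unsigned char before calling isspace(); treat it as a sketch under those assumptions, not as the code that belongs in lib/thread.cc.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical standalone variant: trim leading and trailing whitespace
 * and replace internal vertical whitespace with spaces. Modifies the
 * buffer in place and returns a pointer into it. */
char *
clean_subject (char *str)
{
    char *last;
    char *curr;

    /* Skip leading whitespace. */
    while (isspace ((unsigned char) *str))
        str++;

    if (*str == '\0')
        return str;

    /* Find the last non-whitespace character and cut the string there. */
    last = str + strlen (str) - 1;
    while (last > str && isspace ((unsigned char) *last))
        last--;
    last[1] = '\0';

    /* Replace LF, VT, FF and CR inside the subject with a space. */
    for (curr = str; curr <= last; curr++) {
        if (*curr >= '\n' && *curr <= '\r')
            *curr = ' ';
    }

    return str;
}
```

Because the returned pointer may point past the start of the buffer, a caller that owns the allocation has to keep the original pointer around for freeing.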
---
lib/thread.cc | 36 ++++++++++++++++++++++++++++++++----
1 files changed, 32 insertions(+), 4 deletions(-)
diff --git a/lib/thread.cc b/lib/thread.cc
index 5190a66..7a816ea 100644
--- a/lib/thread.cc
+++ b/lib/thread.cc
@@ -266,6 +266,34 @@ _thread_add_message (notmuch_thread_t *thread,
     }
 }
 
+/* Remove leading/trailing whitespace and replace internal vertical
+ * whitespace with spaces.
+ */
+static char *
+rectify_whitespace (char *str)
+{
+    char *last;
+    char *curr;
+
+    while (isspace (*str))
+        str++;
+
+    if (*str == 0)
+        return str;
+
+    last = str + strlen(str) - 1;
+    while (last > str && isspace (*last))
+        last--;
+
+    curr = str;
+    do
+        if ((*curr >= 10) && (*curr <= 13))
+            *curr = 32; //space
+    while (curr++ < last);
+
+    return str;
+}
+
static void
_thread_set_subject_from_message (notmuch_thread_t *thread,
notmuch_message_t *message)
@@ -282,11 +310,11 @@ _thread_set_subject_from_message (notmuch_thread_t *thread,
         (strncasecmp (subject, "Vs: ", 4) == 0) ||
         (strncasecmp (subject, "Sv: ", 4) == 0)) {
 
-        cleaned_subject = talloc_strndup (thread,
-                                          subject + 4,
-                                          strlen(subject) - 4);
+        cleaned_subject = rectify_whitespace(talloc_strndup (thread,
+                                                             subject + 4,
+                                                             strlen(subject) - 4));
     } else {
-        cleaned_subject = talloc_strdup (thread, subject);
+        cleaned_subject = rectify_whitespace(talloc_strdup (thread, subject));
     }
 
     if (thread->subject)
--
1.7.2.3
[-- Attachment #2: malformed_subject --]
[-- Type: application/octet-stream, Size: 352 bytes --]
Date: Fri, 11 Mar 2011 18:40:00 +0000
From: "redacted" <host@invite.pingg.com>
To: redacted@example.com
Message-Id: <20110311183749.526771.31453.9841841@sender.pingg.com>
Subject: =?utf-8?Q?bring_small_items_for_a_pi=C3=B1ata=21=21=21=21=0A?=
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Ignore this.