From: David Bremner <david@tethera.net>
To: Carl Worth <cworth@cworth.org>,
Mark Walters <markwalters1009@gmail.com>,
notmuch <notmuch@notmuchmail.org>
Subject: Re: [RFC PATCH] Re: excessive thread fusing
Date: Sun, 20 Apr 2014 21:59:26 +0900
Message-ID: <87fvl8upg1.fsf@maritornes.cs.unb.ca>
In-Reply-To: <87oazwjq1e.fsf@yoom.home.cworth.org>
Carl Worth <cworth@cworth.org> writes:
>
> Another idea would be to trigger specifically on common forms. Judging
> from the samples in this particular thread, it seems like a workable
> heuristic would be:
>
> If the In-Reply-To header begins with '<':
>
> Parse that initial portion as a message ID
>
> Else if it ends with '>':
>
> Parse that final portion as a message ID
>
> Else
>
> Ignore this garbage-valued header.
>
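For concreteness, the heuristic quoted above could be sketched in POSIX
shell roughly as follows. The function name parse_in_reply_to is
hypothetical, and the sketch assumes a message ID never contains '<' or
'>' and that a header with no closing '>' counts as garbage. It is an
illustration only, not the code from the patch under discussion.

```sh
# Sketch of the quoted heuristic; parse_in_reply_to is a hypothetical name.
# Prints the message ID without angle brackets, or returns 1 when the
# header value should be ignored.
parse_in_reply_to() {
    value=$1
    case $value in
        "<"*)
            # Begins with '<': take the text up to the first '>'.
            rest=${value#"<"}
            case $rest in
                *">"*) id=${rest%%">"*} ;;
                *) return 1 ;;   # assumption: no closing '>' means garbage
            esac
            ;;
        *">")
            # Ends with '>': take the text after the last '<'.
            rest=${value%">"}
            case $rest in
                *"<"*) id=${rest##*"<"} ;;
                *) return 1 ;;
            esac
            ;;
        *)
            # Neither form: ignore this garbage-valued header.
            return 1
            ;;
    esac
    printf '%s\n' "$id"
}
```

Calling parse_in_reply_to with the raw header value prints the chosen ID;
a non-zero exit status means the header would be ignored.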
Using the hacky script below, I scanned my own mail collection of about
300k messages. I can make the following observations:
- I have some RFC-compliant In-Reply-To headers with multiple IDs.
- I have a non-trivial number of the form
  "Message from $NAME <address> of $date <id>".
- I didn't see any cases where using the last angle-bracketed thing
  would fail.
- I did see some cases where the header starts with '<' but the
  matching '>' is missing.
- I also noticed some RFC 2047 encoding of In-Reply-To headers.
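######################################################################
# hacky script follows
# usage: sh SCRIPT MAILDIR   (needs formail, from procmail)
dir=$1
echo "Scanning $dir"
tempdir=$(mktemp -d)
echo "Writing to ${tempdir}"
# Collect the In-Reply-To value of every message file, one per line.
find "$dir" -type f -exec sh -c 'formail -c -xIn-reply-to < "$1"' sh {} \; \
    > "${tempdir}/ids"
# Normalize whitespace, replace each ID and comment with a placeholder,
# then list the distinct header shapes that remain.
sed -e 's/\t/ /' -e 's/  */ /g' -e 's/<[^ ]*>/<id>/g' \
    -e 's/(.*)/(comment)/' < "${tempdir}/ids" \
    | sort | uniq | tee "${tempdir}/report"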
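
The observation above that the last angle-bracketed thing never failed
suggests an even simpler rule. A minimal sketch, assuming an ID never
contains '<' or '>' (last_bracketed_id is a hypothetical helper name,
and RFC 2047 encoded headers are not decoded here):

```sh
# Hypothetical helper: print the last complete <...> group in the header
# value, without the brackets. Prints nothing if there is no such group.
last_bracketed_id() {
    # The leading .* is greedy, so the captured group is the last
    # '<', bracket-free text, '>' sequence in the line.
    printf '%s\n' "$1" | sed -n 's/.*<\([^<>]*\)>.*/\1/p'
}
```

With this rule a trailing unmatched '<' is skipped in favour of the last
complete pair, and a header with no complete pair yields no output.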