unofficial mirror of notmuch@notmuchmail.org
* bug: "no top level messages" crash on Zen email loops
@ 2018-03-19 13:25 Antoine Beaupré
  2018-03-19 16:36 ` David Bremner
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Antoine Beaupré @ 2018-03-19 13:25 UTC (permalink / raw)
  To: notmuch

[-- Attachment #1: Type: text/plain, Size: 5249 bytes --]

Hi!

Here's a fun bug for you Xapian tricksters.

Two emails attached make notmuch crash when trying to display the
folder.

$ notmuch show thread:0000000000000001
Internal error: Thread 0000000000000001 has no toplevel messages.
 (notmuch-show.c:1012)

Those are the two messages:

$ notmuch search --output messages  thread:0000000000000001
id:9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com
id:9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com

`notmuch show` on either message crashes the same way:

$ notmuch show id:9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com
Internal error: Thread 0000000000000001 has no toplevel messages.
 (notmuch-show.c:1012)

Note that displaying the messages with `--format raw` doesn't crash, so
it's really the thread structure that's broken. Obviously, emacs can't
display the messages either and doesn't touch the unread tags when
trying to load the message, which is to be expected I guess.

Xapian is also unhappy with the database created by notmuch new:

$ xapian-check gitlab/.notmuch/xapian/
docdata:
blocksize=8K items=1 firstunused=1 revision=7 levels=0 root=0
B-tree checked okay
docdata table structure checked OK

termlist:
blocksize=8K items=12 firstunused=4 revision=7 levels=0 root=3
xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0

Valgrind is not particularly unhappy with notmuch, so it doesn't seem
like a memory error:

==26723== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I tried to track this down in gdb; I got as far as finding that, in
`notmuch_thread_get_toplevel_messages`, the `list` object is corrupt (?)
already (`list->head == NULL`) which obviously makes it hard to, er,
list messages in a thread. :p I lost the exact backtrace and so on, but
I'm not sure there's much we can get from gdb: it seems the problem
might be in notmuch-new, but I'm a little out of my depth to debug
*that* without any further pointers.

This is with 0.26-1~bpo9+1 on Debian stretch, but I can also reproduce
with 0.23 on another Debian stretch machine, using a similar mail
spool.

My guess is that those messages are somewhat special: notice how the
In-Reply-To identifiers *loop* between the two messages?

Message one:

Message-ID: <9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com>
In-Reply-To: <9379QM5Z39@zendesk.com>
 <9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com>

Message two:

Message-ID: <9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com>
In-Reply-To: <9379QM5Z39@zendesk.com>
 <9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com>

And indeed, a mailbox with only *one* of those messages doesn't cause
the crash. But also: the original thread is now made of *three*
messages, and taking any one of the two messages above with that *third*
message doesn't cause the crash:

Message three:

Message-ID: <9379QM5Z39_5aaf79c126a_94233ffb30ecb9982187c0_sprut@zendesk.com>
In-Reply-To: <9379QM5Z39@zendesk.com>
 <9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com>

This message also shows up correctly threaded with message two if present;
otherwise the thread is (obviously) broken with only messages one and
three.
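
The three cases above can be sketched with a small model. This is not
notmuch's code: it assumes, for illustration only, that a message's parent
is the last identifier listed in its In-Reply-To header, and that a
"toplevel" message is one whose parent is absent from the mailbox. The
names `TICKET`, `IN_REPLY_TO` and `toplevel_messages` are made up for this
sketch.

```python
# Simplified model of the threading problem; this is NOT notmuch's code.
# Assumption: the parent of a message is the last identifier listed in
# its In-Reply-To header.

TICKET = "9379QM5Z39@zendesk.com"
ONE = "9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com"
TWO = "9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com"
THREE = "9379QM5Z39_5aaf79c126a_94233ffb30ecb9982187c0_sprut@zendesk.com"

# In-Reply-To contents of each message, as quoted above.
IN_REPLY_TO = {
    ONE: [TICKET, TWO],
    TWO: [TICKET, ONE],
    THREE: [TICKET, TWO],
}


def toplevel_messages(present):
    """Return the messages in `present` whose parent is not in `present`."""
    present = set(present)
    tops = []
    for mid in sorted(present):
        refs = IN_REPLY_TO[mid]
        parent = refs[-1] if refs else None
        if parent not in present:
            tops.append(mid)
    return tops


print(len(toplevel_messages([ONE, TWO])))    # 0: each is the other's parent
print(len(toplevel_messages([ONE, THREE])))  # 2: both parents are missing
print(len(toplevel_messages([TWO, THREE])))  # 1: two is the root, three its child
```

Under that assumption, the looping pair is the only two-message
combination that leaves the thread without any toplevel message, which
would match the crash described above.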

Mutt displays those messages as a "normal" three-level thread:

   1   ! mar 14 GitLab Support  (4,7K) Your GitLab support request has been received
   2   ! mar 14 GitLab Support  (4,5K) └>comments not showing up?
   3 O ! mar 19 XXXXXXXXXXXXXXX (7,9K)  └>[GitLab, Inc.] Re: comments not showing up?

The numbers on the left (1, 2, 3) correspond to the labeling I used
above as well (one, two, three).

The third message is not included here because it's an actual reply from
a human from GitLab (yay gitlab! :) for which I'd need approval before
sharing here. The first message is an automated response so I thought it
was fair game to share publicly. The second is a copy of my own message
which triggered the autoreply, which is probably the source of the
loop. The software generating this mess is Zendesk.com. I haven't had
that problem with other interactions with Zendesk, maybe because I
never talked with a Zendesk that sent autoreplies.

To reproduce this, untar the attachment anywhere (say $HOME) and then
hack a notmuch config file pointing there, e.g.:

$ diff .notmuch-config*
15c15
< path=/home/anarcat/Maildir/
---
> path=/home/anarcat/gitlab/

Then point notmuch to that config (export
NOTMUCH_CONFIG=~/.notmuch-config-test) and run notmuch new (which should
find only two messages). Then run the commands from the top of this
email, of course. :)

Thanks for any input,

A.

PS: I must say I am grateful and impressed by the reliability of
notmuch. I've been using notmuch for *years* now and it's the *first*
time, for as long as I remember, that I had to go back to mutt to read
email. So kudos to the team, good job. :)

-- 
Si les élections n'étaient pas indispensables à la prospérité du
capital, on ne nous les servirait pas partout, toujours, à coup de
fric, à coup de flics.
                        - René Binamé

[-- Attachment #2: zendesk-email-loop.tgz --]
[-- Type: application/x-gtar-compressed, Size: 5959 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: bug: "no top level messages" crash on Zen email loops
  2018-03-19 13:25 bug: "no top level messages" crash on Zen email loops Antoine Beaupré
@ 2018-03-19 16:36 ` David Bremner
  2018-03-19 17:50   ` Antoine Beaupré
  2018-03-20 21:22 ` [PATCH 1/2] test: two new messages for the 'broken' corpus David Bremner
  2018-04-28 13:28 ` bug: "no top level messages" crash on Zen email loops David Bremner
  2 siblings, 1 reply; 22+ messages in thread
From: David Bremner @ 2018-03-19 16:36 UTC (permalink / raw)
  To: Antoine Beaupré, notmuch

Antoine Beaupré <anarcat@orangeseeds.org> writes:

> Hi!
>
> Here's a fun bug for you Xapian tricksters.
>
> Two emails attached make notmuch crash when trying to display the
> folder.
>
> $ notmuch show thread:0000000000000001
> Internal error: Thread 0000000000000001 has no toplevel messages.
>  (notmuch-show.c:1012)
>
> Those are the two messages:
>
> $ notmuch search --output messages  thread:0000000000000001
> id:9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com
> id:9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com
>
> `notmuch show` on either message crashes the same way:
>
> $ notmuch show id:9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com
> Internal error: Thread 0000000000000001 has no toplevel messages.
>  (notmuch-show.c:1012)

I can't duplicate that part.  

>
> Note that displaying the messages with `--format raw` doesn't crash, so
> it's really the thread structure that's broken. Obviously, emacs can't
> display the messages either and doesn't touch the unread tags when
> trying to load the message, which is to be expected I guess.
>
> Xapian is also unhappy with the database created by notmuch new:
>
> $ xapian-check gitlab/.notmuch/xapian/
> docdata:
> blocksize=8K items=1 firstunused=1 revision=7 levels=0 root=0
> B-tree checked okay
> docdata table structure checked OK
>
> termlist:
> blocksize=8K items=12 firstunused=4 revision=7 levels=0 root=3
> xapian-check: DatabaseError: 1 unused block(s) missing from the free list, first is 0

Surprisingly (to me), I can duplicate that, so that's something to
pursue.

d

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: bug: "no top level messages" crash on Zen email loops
  2018-03-19 16:36 ` David Bremner
@ 2018-03-19 17:50   ` Antoine Beaupré
  2018-03-19 17:56     ` Antoine Beaupré
  2018-03-19 20:03     ` bug: "no top level messages" crash on Zen email loops David Bremner
  0 siblings, 2 replies; 22+ messages in thread
From: Antoine Beaupré @ 2018-03-19 17:50 UTC (permalink / raw)
  To: David Bremner, notmuch

On 2018-03-19 13:36:49, David Bremner wrote:
> Antoine Beaupré <anarcat@orangeseeds.org> writes:
>
>> Hi!
>>
>> Here's a fun bug for you Xapian tricksters.
>>
>> Two emails attached make notmuch crash when trying to display the
>> folder.
>>
>> $ notmuch show thread:0000000000000001
>> Internal error: Thread 0000000000000001 has no toplevel messages.
>>  (notmuch-show.c:1012)
>>
>> Those are the two messages:
>>
>> $ notmuch search --output messages  thread:0000000000000001
>> id:9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com
>> id:9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com
>>
>> `notmuch show` on either message crashes the same way:
>>
>> $ notmuch show id:9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com
>> Internal error: Thread 0000000000000001 has no toplevel messages.
>>  (notmuch-show.c:1012)
>
> I can't duplicate that part.  

That's very strange. I can reproduce this on my workstation here, but
taking the tarball I sent in the original message, I can't reproduce
anymore. So something changed! I suspect it's the "flags" on the
message. I have "F" everywhere because I'm experimenting with syncing
(badly) my inbox tag everywhere, through the flagged tag. All post-new
hooks stuff that shouldn't affect this because it's in a new
environment, but it does change the flag on the files sometimes.

So attached is a *new* reproducer, with which I *can* reproduce in a
clean VM with notmuch from stretch (0.23?).

To reproduce, with a `debian/stretch64` vagrant VM:

host$ vagrant init debian/stretch64 && vagrant up && vagrant ssh
guest$ sudo apt install notmuch
guest$ notmuch setup # pick all defaults
guest$ wget $url_of_the_reproducer
guest$ tar zxfv zendesk-mail-loop2.tgz
guest$ mv gitlab mail # to put it where notmuch expects
guest$ notmuch new
guest$ notmuch show thread:0000000000000001
Internal error: Thread 0000000000000001 has no toplevel messages.
 (notmuch-show.c:957)

I can reproduce this reproducibly here now.

Phew, that is definitely weird! For what it's worth, here's the diff
between the two tarballs:

[429]anarcat@curie:~1$ diffoscope zendesk-email-loop.tgz zendesk-email-loop2.tgz
--- zendesk-email-loop.tgz
+++ zendesk-email-loop2.tgz
├── metadata
│ @@ -1 +1 @@
│ -gzip compressed data, last modified: Mon Mar 19 13:21:40 2018, from Unix
│ +gzip compressed data, last modified: Mon Mar 19 17:38:29 2018, from Unix
│   --- zendesk-email-loop.tgz-content
├── +++ zendesk-email-loop2.tgz-content
├── file list
│ │ @@ -1,5 +1,5 @@
│ │ -drwx------   0 anarcat   (1000) anarcat   (1000)        0 2018-03-19 13:11:45.000000 gitlab/cur/
│ │ --rw-------   0 anarcat   (1000) anarcat   (1000)     8858 2018-03-14 00:27:37.000000 gitlab/cur/1521465105.R3423354954039434325.curie:2,FS
│ │ --rw-------   0 anarcat   (1000) anarcat   (1000)    11861 2018-03-19 08:50:10.000000 gitlab/cur/1521464914.R16228666356894086807.curie:2,F
│ │ +drwx------   0 anarcat   (1000) anarcat   (1000)        0 2018-03-19 17:35:32.000000 gitlab/cur/
│ │ +-rw-------   0 anarcat   (1000) anarcat   (1000)     8858 2018-03-14 00:27:37.000000 gitlab/cur/1521463753.R9368947314807690338.curie:2,FS
│ │ +-rw-------   0 anarcat   (1000) anarcat   (1000)     8720 2018-03-14 00:30:59.000000 gitlab/cur/1521463752.R13151765805797588408.curie:2,FS
│ │  drwx------   0 anarcat   (1000) anarcat   (1000)        0 2018-03-19 12:49:12.000000 gitlab/new/
│ │ -drwx------   0 anarcat   (1000) anarcat   (1000)        0 2018-03-19 13:11:45.000000 gitlab/tmp/
│ │ +drwx------   0 anarcat   (1000) anarcat   (1000)        0 2018-03-19 12:49:13.000000 gitlab/tmp/
│   --- gitlab/cur/1521465105.R3423354954039434325.curie:2,FS
├── +++ gitlab/cur/1521463753.R9368947314807690338.curie:2,FS

i.e. the files are identical, but the serial numbers, timestamps and
flags differ. Maybe this makes the directory ordering (so the load order
in notmuch new) differ? No idea.
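
To make the comparison of the two listings easier to eyeball, here is a
minimal sketch that splits each Maildir filename into its unique part and
its flag letters (the `:2,` suffix, where `F` is "flagged" and `S` is
"seen") and prints both listings in sorted order. The filenames and sizes
are copied from the diffoscope output above; the helper name
`parse_maildir_name` is made up for this sketch. Sorted order is only a
stand-in here: the order in which `notmuch new` actually visits files is
not established in this thread.

```python
# Sketch only: compares the two file listings from the diffoscope output.

def parse_maildir_name(name):
    """Split 'unique:2,FLAGS' into (unique, set of flag letters)."""
    unique, sep, info = name.partition(":2,")
    flags = set(info) if sep else set()
    return unique, flags


# filename -> size in bytes, as shown in the diffoscope file list
first_tarball = {
    "1521465105.R3423354954039434325.curie:2,FS": 8858,
    "1521464914.R16228666356894086807.curie:2,F": 11861,
}
second_tarball = {
    "1521463753.R9368947314807690338.curie:2,FS": 8858,
    "1521463752.R13151765805797588408.curie:2,FS": 8720,
}

for label, listing in (("first", first_tarball), ("second", second_tarball)):
    print(label)
    for name in sorted(listing):
        unique, flags = parse_maildir_name(name)
        print("  ", unique, "".join(sorted(flags)), listing[name])
```

Besides names and flags, the listing also shows a different size for one
of the two files (11861 versus 8720 bytes), so it may be worth comparing
sizes or contents as well as ordering.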

But hopefully this will allow you to reproduce more reliably.

A.

-- 
La seule excuse de Dieu, c'est qu'il n'existe pas.
                        - Stendhal

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: bug: "no top level messages" crash on Zen email loops
  2018-03-19 17:50   ` Antoine Beaupré
@ 2018-03-19 17:56     ` Antoine Beaupré
  2018-03-19 19:25       ` tip: how to not forget attachments Antoine Beaupré
  2018-03-19 20:03     ` bug: "no top level messages" crash on Zen email loops David Bremner
  1 sibling, 1 reply; 22+ messages in thread
From: Antoine Beaupré @ 2018-03-19 17:56 UTC (permalink / raw)
  To: David Bremner, notmuch

[-- Attachment #1: Type: text/plain, Size: 49 bytes --]

And obviously I forgot the frigging attachment.


[-- Attachment #2: zendesk-email-loop2.tgz --]
[-- Type: application/x-gtar-compressed, Size: 5368 bytes --]

[-- Attachment #3: Type: text/plain, Size: 137 bytes --]


PS: don't we have a "you forgot to actually attach the damn file" plugin
when we detect the word "attachment" and there's no attach? :p

^ permalink raw reply	[flat|nested] 22+ messages in thread

* tip: how to not forget attachments
  2018-03-19 17:56     ` Antoine Beaupré
@ 2018-03-19 19:25       ` Antoine Beaupré
  2018-03-19 19:57         ` Brian Sniffen
  0 siblings, 1 reply; 22+ messages in thread
From: Antoine Beaupré @ 2018-03-19 19:25 UTC (permalink / raw)
  To: notmuch

[-- Attachment #1: Type: text/plain, Size: 1191 bytes --]

On 2018-03-19 13:56:54, Antoine Beaupré wrote:
> PS: don't we have a "you forgot to actually attach the damn file" plugin
> when we detect the word "attachment" and there's no attach? :p

So I figured that one out, I think. Before adding it to the wiki, I'd
like a review of the code (attached) from more adept elisp programmers.

I'm particularly surprised that save-excursion doesn't work the way I
expect: when I answer "no" to the question, I go back to the email
buffer, but the point is invariably at the end of the buffer, whereas I
would expect it to be where it was when I sent the message. It looks
like something else moves the mark before my hook, but I'm not sure
what...

How else than (error) or (keyboard-quit) am I supposed to abort email
sending? (message-send) uses the latter but it would seem better to use
an actual error message than to just "beep" our way out here..

Other advice? (save-excursion) + (goto-char (point-min)) +
(re-search-forward), is that idiomatic? or is there something more
clever that should be done?
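
For readers who want the check itself spelled out independently of Emacs,
here is a rough sketch of the same idea in Python rather than elisp: look
for an "attach"-like word in the plain-text body and see whether the
message actually carries an attachment. It is not the attached
notmuch-buddha.el and says nothing about how to abort sending from a
hook; the regular expression and the function name are assumptions made
for this sketch.

```python
import re
from email.message import EmailMessage

# Assumed pattern: matches "attach", "attached", "attachment(s)".
ATTACH_RE = re.compile(r"\battach(ed|ment|ments)?\b", re.IGNORECASE)


def mentions_attachment_without_one(msg):
    """True if the plain-text body talks about an attachment but none exists."""
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    has_attachment = any(True for _ in msg.iter_attachments())
    return bool(ATTACH_RE.search(text)) and not has_attachment


msg = EmailMessage()
msg.set_content("So attached is a *new* reproducer.")
print(mentions_attachment_without_one(msg))  # True: mentioned, nothing attached

msg.add_attachment(b"not a real tarball", maintype="application",
                   subtype="octet-stream", filename="zendesk-email-loop2.tgz")
print(mentions_attachment_without_one(msg))  # False: the file is there now
```

A real hook would probably also want to skip quoted lines, since a reply
that quotes the word "attachment" would otherwise trigger the warning.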

thanks!

-- 
Tu connaîtras la vérité de ton chemin à ce qui te rend heureux.
                        - Aristote

[-- Attachment #2: notmuch-buddha.el --]
[-- Type: application/emacs-lisp, Size: 770 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: tip: how to not forget attachments
  2018-03-19 19:25       ` tip: how to not forget attachments Antoine Beaupré
@ 2018-03-19 19:57         ` Brian Sniffen
  2018-03-19 20:16           ` Antoine Beaupré
  0 siblings, 1 reply; 22+ messages in thread
From: Brian Sniffen @ 2018-03-19 19:57 UTC (permalink / raw)
  To: Antoine Beaupré; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 1768 bytes --]

`error` doesn’t do any unwinding; it leaves the program state wherever it was for analysis.  You probably want throw/catch, as described at https://www.gnu.org/software/emacs/manual/html_node/elisp/Catch-and-Throw.html#Catch-and-Throw

-- 
Brian Sniffen

> On Mar 19, 2018, at 3:25 PM, Antoine Beaupré <anarcat@orangeseeds.org> wrote:
> 
>> On 2018-03-19 13:56:54, Antoine Beaupré wrote:
>> PS: don't we have a "you forgot to actually attach the damn file" plugin
>> when we detect the word "attachment" and there's no attach? :p
> 
> So I figured that one out, I think. Before adding it to the wiki, I'd
> like a review of the code (attached) from more adept elisp programmers.
> 
> I'm particularly surprised that save-excursion doesn't work the way I
> expect: when I answer "no" to the question, I go back to the email
> buffer, but the point is invariably at the end of the buffer, whereas I
> would expect it to be where it was when I send the message. It looks
> like something else moves the mark before my hook, but I'm not sure
> what...
> 
> How else than (error) or (keyboard-quit) am I supposed to abort email
> sending? (message-send) uses the latter but it would seem better to use
> an actual error message than to just "beep" our way out here..
> 
> Other advice? (save-excursion) + (goto-char (point-min)) +
> (re-search-forward), is that idiomatic? or is there something more
> clever that should be done?
> 
> thanks!
> 
> -- 
> Tu connaîtras la vérité de ton chemin à ce qui te rend heureux.
>                        - Aristote
> <notmuch-buddha.el>

[-- Attachment #2: Type: text/html, Size: 2970 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: bug: "no top level messages" crash on Zen email loops
  2018-03-19 17:50   ` Antoine Beaupré
  2018-03-19 17:56     ` Antoine Beaupré
@ 2018-03-19 20:03     ` David Bremner
  2018-03-29  3:17       ` Olly Betts
  1 sibling, 1 reply; 22+ messages in thread
From: David Bremner @ 2018-03-19 20:03 UTC (permalink / raw)
  To: Antoine Beaupré, notmuch; +Cc: xapian-discuss

Antoine Beaupré <anarcat@orangeseeds.org> writes:

> On 2018-03-19 13:36:49, David Bremner wrote:
>>
>> I can't duplicate that part.  
>
> That's very strange. I can reproduce this on my workstation here, but
> taking the tarball I sent in the original message, I can't reproduce
> anymore. So something changed! I suspect it's the "flags" on the
> message. I have "F" everywhere because I'm experimenting with syncing
> (badly) my inbox tag everywhere, through the flagged tag. All post-new
> hooks stuff that shouldn't affect this because it's in a new
> environment, but it does change the flag on the files sometimes.
>
> So attached is a *new* reproducer, with which I *can* reproduce in a
> clean VM with notmuch from stretch (0.23?).

I can confirm this reproduces both the xapian-check and the notmuch-show
error. Olly agrees that whatever notmuch is doing wrong, it shouldn't
lead to a corrupted database (unless we reach around the API and access
files directly, which I don't think we do).

d

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: tip: how to not forget attachments
  2018-03-19 19:57         ` Brian Sniffen
@ 2018-03-19 20:16           ` Antoine Beaupré
  2018-03-19 21:40             ` Brian Sniffen
  0 siblings, 1 reply; 22+ messages in thread
From: Antoine Beaupré @ 2018-03-19 20:16 UTC (permalink / raw)
  To: Brian Sniffen; +Cc: notmuch

On 2018-03-19 15:57:05, Brian Sniffen wrote:
> `error` doesn’t do any unwinding; it leaves the program state wherever it was for analysis.  You probably want throw/catch, as described at https://www.gnu.org/software/emacs/manual/html_node/elisp/Catch-and-Throw.html#Catch-and-Throw

Wait, but what tag would I throw? message-send doesn't do any catching
around the hook calls...

a.

-- 
The United States is a nation of laws:
badly written and randomly enforced.
                        - Frank Zappa

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: tip: how to not forget attachments
  2018-03-19 20:16           ` Antoine Beaupré
@ 2018-03-19 21:40             ` Brian Sniffen
  2018-03-19 21:47               ` Antoine Beaupré
  0 siblings, 1 reply; 22+ messages in thread
From: Brian Sniffen @ 2018-03-19 21:40 UTC (permalink / raw)
  To: Antoine Beaupré; +Cc: notmuch

Throw your function name, catch it outside the save-excursion, and raise an error there?

-- 
Brian Sniffen

> On Mar 19, 2018, at 4:16 PM, Antoine Beaupré <anarcat@orangeseeds.org> wrote:
> 
>> On 2018-03-19 15:57:05, Brian Sniffen wrote:
>> `error` doesn’t do any unwinding; it leaves the program state wherever it was for analysis.  You probably want throw/catch, as described at https://www.gnu.org/software/emacs/manual/html_node/elisp/Catch-and-Throw.html#Catch-and-Throw
> 
> Wait, but what tag would I throw? message-send doesn't do any catching
> around the hook calls...
> 
> a.
> 
> -- 
> The United States is a nation of laws:
> badly written and randomly enforced.
>                        - Frank Zappa

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: tip: how to not forget attachments
  2018-03-19 21:40             ` Brian Sniffen
@ 2018-03-19 21:47               ` Antoine Beaupré
  0 siblings, 0 replies; 22+ messages in thread
From: Antoine Beaupré @ 2018-03-19 21:47 UTC (permalink / raw)
  To: Brian Sniffen; +Cc: notmuch

On 2018-03-19 17:40:40, Brian Sniffen wrote:
> Throw your function name, catch it outside the save-excursion, and raise an error there?

You mean use catch/throw so that save-excursion restores the point correctly?
But my tests show the point is moved by something else in message-send
anyways, so I'm not sure I should even bother at that point...

Or should I?

a.

-- 
Il faut tout un village pour élever un enfant.
                        - Proverbe africain

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/2] test: two new messages for the 'broken' corpus
  2018-03-19 13:25 bug: "no top level messages" crash on Zen email loops Antoine Beaupré
  2018-03-19 16:36 ` David Bremner
@ 2018-03-20 21:22 ` David Bremner
  2018-03-20 21:22   ` [PATCH 2/2] test: add known broken test for indexing an In-Reply-To loop David Bremner
  2018-04-28 13:28 ` bug: "no top level messages" crash on Zen email loops David Bremner
  2 siblings, 1 reply; 22+ messages in thread
From: David Bremner @ 2018-03-20 21:22 UTC (permalink / raw)
  To: Antoine Beaupré, notmuch

These have an 'In-Reply-To' loop, which currently confuses "notmuch
new".

Courtesy of anarcat.
---
 .../1521463752.R13151765805797588408.curie:2,FS    | 173 ++++++++++++++++++++
 .../cur/1521463753.R9368947314807690338.curie:2,FS | 180 +++++++++++++++++++++
 2 files changed, 353 insertions(+)
 create mode 100644 test/corpora/broken/gitlab/cur/1521463752.R13151765805797588408.curie:2,FS
 create mode 100644 test/corpora/broken/gitlab/cur/1521463753.R9368947314807690338.curie:2,FS

diff --git a/test/corpora/broken/gitlab/cur/1521463752.R13151765805797588408.curie:2,FS b/test/corpora/broken/gitlab/cur/1521463752.R13151765805797588408.curie:2,FS
new file mode 100644
index 00000000..0dc4ecf8
--- /dev/null
+++ b/test/corpora/broken/gitlab/cur/1521463752.R13151765805797588408.curie:2,FS
@@ -0,0 +1,173 @@
+Return-Path: <support@gitlab.com>
+X-Original-To: anarcat+gitlab@anarc.at
+Delivered-To: anarcat+gitlab@anarc.at
+Received: from marcos.anarc.at (localhost [127.0.0.1])
+	by delivery.anarc.at (Postfix) with ESMTP id 99EA510E050
+	for <anarcat+gitlab@anarc.at>; Tue, 13 Mar 2018 20:30:59 -0400 (EDT)
+X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on marcos.anarc.at
+X-Spam-Level: 
+X-Spam-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,
+	DKIM_VALID,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,RCVD_IN_DNSWL_MED
+	autolearn=ham autolearn_force=no version=3.4.1
+Received: from deferred1.pod6.iad1.zdsys.com (deferred1.pod6.iad1.zdsys.com [192.161.153.116])
+	by mx.anarc.at (Postfix) with ESMTPS id 83B8F10E04F
+	for <anarcat+gitlab@anarc.at>; Tue, 13 Mar 2018 20:30:59 -0400 (EDT)
+Received: from out4.pod6.iad1.zdsys.com (out4.pod6.iad1.zdsys.com [192.161.153.103])
+	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
+	(No client certificate requested)
+	by deferred1.pod6.iad1.zdsys.com (Postfix) with ESMTPS id D31684091244
+	for <anarcat+gitlab@anarc.at>; Wed, 14 Mar 2018 00:21:40 +0000 (UTC)
+Received: from out4.pod6.iad1.zdsys.com (localhost.localdomain [127.0.0.1])
+	by out4.pod6.iad1.zdsys.com (Postfix) with ESMTP id CEBCE21CBE26
+	for <anarcat+gitlab@anarc.at>; Wed, 14 Mar 2018 00:21:39 +0000 (UTC)
+Received: from zendesk.com (work11.pod6.iad1.zdsys.com [10.112.38.50])
+	by out4.pod6.iad1.zdsys.com (Postfix) with ESMTP id AFEB621CBE2F
+	for <anarcat+gitlab@anarc.at>; Wed, 14 Mar 2018 00:21:39 +0000 (UTC)
+Date: Wed, 14 Mar 2018 00:21:39 +0000
+From: GitLab Support <support@gitlab.com>
+Reply-To: GitLab Support <support@gitlab.com>
+To: Anarcat+gitlab <anarcat+gitlab@anarc.at>
+Message-ID: <9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com>
+In-Reply-To: <9379QM5Z39@zendesk.com>
+ <9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com>
+Subject: comments not showing up?
+Mime-Version: 1.0
+Content-Type: multipart/alternative;
+ boundary="--==_mimepart_5aa86b13a739d_174eb3fc97a2cb98c71811";
+ charset=utf-8
+Content-Transfer-Encoding: 7bit
+X-Auto-Response-Suppress: All
+Auto-Submitted: auto-generated
+X-Mailer: Zendesk Mailer
+X-Delivery-Context: event-id-485367902188
+X-Zendesk-From-Account-Id: 2cbf2e0
+DKIM-Signature:  v=1; a=rsa-sha256; c=relaxed/relaxed; d=zendesk.com;
+ q=dns/txt; s=zendesk1; t=1520986899;
+ bh=ef6zoCM91gZVUja0Us+jy2UVcPK+sNhZocvn333kfzE=;
+ h=date:from:reply-to:to:message-id:in-reply-to:subject:mime-version:content-type:content-transfer-encoding;
+ b=tvPyIz3Cw61n5u2siOiyRiEMAKeDmEu2DMg1Ss534+0PPvbTgruWrWbZklJzy56RDIPi4hoK+Ui6gz0/ih6TyQXG6tpFMeZ4xI49Gqypu1Q2Xo1Uvu6WPYDe8n7D2BJ/8wP6+uqZ+DpAa7ldNi2opHVvmd6GKCuL0fN8lWvdDm4=
+DKIM-Signature:  v=1; a=rsa-sha256; c=relaxed/relaxed;
+ d=gmailmarkup.zendesk.com; q=dns/txt; s=zendesk1; t=1520986899;
+ bh=ef6zoCM91gZVUja0Us+jy2UVcPK+sNhZocvn333kfzE=;
+ h=date:from:reply-to:to:message-id:in-reply-to:subject:mime-version:content-type:content-transfer-encoding;
+ b=sqLp5VpKfrylgT2N7zbweDs3dccEXM44wokM/rxnZ49p9/wYDJNMbffB8yXXZa1BJ0KRfl/UFqoP8YZPYr72a+E291Ug+zq12UJi5MW2VnwMPJxAp+X9hQe90AzNecBDjOUn95qiCKnvVjhtT/LVePm9BbNh8UwC5W3qh/qFjVk=
+X-CMAE-Score: 0
+X-CMAE-Analysis: v=2.3 cv=RMOd4bq+ c=1 sm=1 tr=0
+	a=PnllObM1nxnwd3s59rm/oQ==:117 a=KiCxJD0x+Pe5VASQKmYoJrcyuOo=:19
+	a=IkcTkHD0fZMA:10 a=ZZnuYtJkoWoA:10 a=p0WdMEafAAAA:8
+	a=mvNKaxnlhnHt-7JDRIsA:9 a=QEXdDO2ut3YA:10 a=iCiO2nKxGhgA:10
+	a=AxTCswu-AAAA:8 a=A-Ay9Xv3AAAA:8 a=TeuQqM9sAAAA:8 a=SSmOFEACAAAA:8
+	a=9TiCOqs2P4ysbtUUcEZcEYoys8A=:19 a=DNiwoKe6oyfyDjaX:21 a=_W_S_7VecoQA:10
+	a=frz4AuCg-hUA:10 a=grImUnVaQDeKGm2TToIA:22
+X-AV-Checked: Zendesk using ClamAV - CLEAN
+X-CMAE-Score: 0
+X-CMAE-Analysis: v=2.3 cv=RMOd4bq+ c=1 sm=1 tr=0
+	a=WkljmVdYkabdwxfqvArNOQ==:117 a=KiCxJD0x+Pe5VASQKmYoJrcyuOo=:19
+	a=IkcTkHD0fZMA:10 a=v2DPQv5-lfwA:10 a=ZZnuYtJkoWoA:10 a=p0WdMEafAAAA:8
+	a=mvNKaxnlhnHt-7JDRIsA:9 a=QEXdDO2ut3YA:10 a=iCiO2nKxGhgA:10
+	a=AxTCswu-AAAA:8 a=A-Ay9Xv3AAAA:8 a=TeuQqM9sAAAA:8 a=SSmOFEACAAAA:8
+	a=9TiCOqs2P4ysbtUUcEZcEYoys8A=:19 a=DNiwoKe6oyfyDjaX:21 a=_W_S_7VecoQA:10
+	a=frz4AuCg-hUA:10 a=grImUnVaQDeKGm2TToIA:22
+Content-Length: 4613
+
+
+----==_mimepart_5aa86b13a739d_174eb3fc97a2cb98c71811
+Content-Type: text/plain;
+ charset=utf-8
+Content-Transfer-Encoding: quoted-printable
+
+##- Please type your reply above this line -##
+
+----------------------------------------------
+
+Anarcat+gitlab, Mar 13, 20:21 EDT
+
+in [this issue](https://gitlab.com/anarcat/wallabako/issues/15), there do=
+esn't seem to be any comments. yet you can see there is more than one par=
+ticipant on the sidebar, and, in the [issue listing](https://gitlab.com/)=
+ it actually says the issue has 6 comments.
+
+where did those comments go?
+
+--------------------------------
+This email is a service from GitLab, Inc..
+
+
+
+
+
+
+
+
+
+[9379QM-5Z39]=
+
+----==_mimepart_5aa86b13a739d_174eb3fc97a2cb98c71811
+Content-Type: text/html;
+ charset=utf-8
+Content-Transfer-Encoding: quoted-printable
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://ww=
+w.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html>
+<head>
+  <meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Dutf-8=
+" />
+  <style type=3D"text/css">
+    table td {
+      border-collapse: collapse;
+    }
+  </style>
+</head>
+<body style=3D"width: 100%!important; margin: 0; padding: 0;">
+  <div style=3D"padding: 10px ; line-height: 18px; font-family: 'Lucida G=
+rande',Verdana,Arial,sans-serif; font-size: 12px; color:#444444;">
+    <div style=3D"color: #b5b5b5;">##- Please type your reply above this =
+line -##</div>
+    <div style=3D"margin-top: 25px" data-version=3D"2"><table width=3D"10=
+0%" cellpadding=3D"0" cellspacing=3D"0" border=3D"0">  <tr>    <td width=3D=
+"100%" style=3D"padding: 15px 0; border-top: 1px dotted #c5c5c5;">      <=
+table width=3D"100%" cellpadding=3D"0" cellspacing=3D"0" border=3D"0" sty=
+le=3D"table-layout:fixed;">        <tr>                      <td valign=3D=
+"top" style=3D"padding: 0 15px 0 15px; width: 40px;">              <img a=
+lt=3D"Anarcat+gi" height=3D"40" src=3D"https://secure.gravatar.com/avatar=
+/35b1eb000c098a531e6f3cc6886fe2d2?size=3D40&amp;default=3Dhttps%3A%2F%2Fa=
+ssets.zendesk.com%2Fimages%2F2016%2Fdefault-avatar-80.png&amp;r=3Dg" styl=
+e=3D"height: auto; line-height: 100%; outline: none; text-decoration: non=
+e; -webkit-border-radius: 5px; -moz-border-radius: 5px; border-radius: 5p=
+x;" width=3D"40" />            </td>                    <td width=3D"100%=
+" style=3D"padding: 0; margin: 0;" valign=3D"top">            <p style=3D=
+"font-family: 'Lucida Grande','Lucida Sans Unicode','Lucida Sans',Verdana=
+,Tahoma,sans-serif; font-size: 15px; line-height: 18px; margin-bottom: 0;=
+ margin-top: 0; padding: 0; color:#1b1d1e;">                             =
+ <strong>Anarcat+gitlab</strong>                          </p>           =
+ <p style=3D"font-family: 'Lucida Grande','Lucida Sans Unicode','Lucida S=
+ans',Verdana,Tahoma,sans-serif; font-size: 13px; line-height: 25px; margi=
+n-bottom: 15px; margin-top: 0; padding: 0; color:#bbbbbb;">              =
+Mar 13, 20:21 EDT            </p>                                    <div=
+ class=3D"zd-comment" style=3D"color: #2b2e2f; font-family: 'Lucida Sans =
+Unicode', 'Lucida Grande', 'Tahoma', Verdana, sans-serif; font-size: 14px=
+; line-height: 22px; margin: 15px 0"><p style=3D"color: #2b2e2f; font-fam=
+ily: 'Lucida Sans Unicode', 'Lucida Grande', 'Tahoma', Verdana, sans-seri=
+f; font-size: 14px; line-height: 22px; margin: 15px 0" dir=3D"auto">in [t=
+his issue](<a href=3D"https://gitlab.com/anarcat/wallabako/issues/15" rel=
+=3D"nofollow noreferrer" target=3D"_blank">https://gitlab.com/anarcat/wal=
+labako/issues/15</a>), there doesn't seem to be any comments. yet you can=
+ see there is more than one participant on the sidebar, and, in the [issu=
+e listing](<a href=3D"https://gitlab.com/" rel=3D"nofollow noreferrer" ta=
+rget=3D"_blank">https://gitlab.com/</a>) it actually says the issue has 6=
+ comments.</p><p style=3D"color: #2b2e2f; font-family: 'Lucida Sans Unico=
+de', 'Lucida Grande', 'Tahoma', Verdana, sans-serif; font-size: 14px; lin=
+e-height: 22px; margin: 15px 0" dir=3D"auto">where did those comments go?=
+</p></div>                                  </td>        </tr>      </tab=
+le>    </td>  </tr></table></div>
+  </div>
+<span style=3D'color:#FFFFFF'>[9379QM-5Z39]</span></body>
+</html>
+<div itemscope itemtype=3D"http://schema.org/EmailMessage" style=3D"displ=
+ay:none">  <div itemprop=3D"action" itemscope itemtype=3D"http://schema.o=
+rg/ViewAction">    <link itemprop=3D"url" href=3D"https://support.gitlab.=
+com/hc/requests/92542" />    <meta itemprop=3D"name" content=3D"View tick=
+et"/>  </div></div>=
+
+----==_mimepart_5aa86b13a739d_174eb3fc97a2cb98c71811--
diff --git a/test/corpora/broken/gitlab/cur/1521463753.R9368947314807690338.curie:2,FS b/test/corpora/broken/gitlab/cur/1521463753.R9368947314807690338.curie:2,FS
new file mode 100644
index 00000000..4e50a7a0
--- /dev/null
+++ b/test/corpora/broken/gitlab/cur/1521463753.R9368947314807690338.curie:2,FS
@@ -0,0 +1,180 @@
+Return-Path: <support@gitlab.com>
+X-Original-To: anarcat+gitlab@anarc.at
+Delivered-To: anarcat+gitlab@anarc.at
+Received: from marcos.anarc.at (localhost [127.0.0.1])
+	by delivery.anarc.at (Postfix) with ESMTP id 14A5310E050
+	for <anarcat+gitlab@anarc.at>; Tue, 13 Mar 2018 20:27:37 -0400 (EDT)
+X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on marcos.anarc.at
+X-Spam-Level: 
+X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_50,DKIM_SIGNED,
+	DKIM_VALID,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,RCVD_IN_DNSWL_MED
+	autolearn=ham autolearn_force=no version=3.4.1
+X-Greylist: delayed 356 seconds by postgrey-1.36 at marcos; Tue, 13 Mar 2018 20:27:36 EDT
+Received: from deferred4.pod6.iad1.zdsys.com (deferred4.pod6.iad1.zdsys.com [192.161.153.119])
+	by mx.anarc.at (Postfix) with ESMTPS id F22C310E04F
+	for <anarcat+gitlab@anarc.at>; Tue, 13 Mar 2018 20:27:36 -0400 (EDT)
+Received: from out1.pod6.iad1.zdsys.com (out1.pod6.iad1.zdsys.com [192.161.153.100])
+	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
+	(No client certificate requested)
+	by deferred4.pod6.iad1.zdsys.com (Postfix) with ESMTPS id C448461A2A01
+	for <anarcat+gitlab@anarc.at>; Wed, 14 Mar 2018 00:21:40 +0000 (UTC)
+Received: from out1.pod6.iad1.zdsys.com (localhost.localdomain [127.0.0.1])
+	by out1.pod6.iad1.zdsys.com (Postfix) with ESMTP id BACEC2027796
+	for <anarcat+gitlab@anarc.at>; Wed, 14 Mar 2018 00:21:39 +0000 (UTC)
+Received: from zendesk.com (work11.pod6.iad1.zdsys.com [10.112.38.50])
+	by out1.pod6.iad1.zdsys.com (Postfix) with ESMTP id 9BB1720002E5
+	for <anarcat+gitlab@anarc.at>; Wed, 14 Mar 2018 00:21:39 +0000 (UTC)
+Date: Wed, 14 Mar 2018 00:21:39 +0000
+From: GitLab Support <support@gitlab.com>
+Reply-To: GitLab Support <support@gitlab.com>
+To: Anarcat+gitlab <anarcat+gitlab@anarc.at>
+Message-ID: <9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com>
+In-Reply-To: <9379QM5Z39@zendesk.com>
+ <9379QM5Z39_5aa86b1350504_174eb3fc97a2cb98c71674_sprut@zendesk.com>
+Subject: Your GitLab support request has been received
+Mime-Version: 1.0
+Content-Type: multipart/alternative;
+ boundary="--==_mimepart_5aa86b1391a39_174033fc97a2cb98c72164";
+ charset=utf-8
+Content-Transfer-Encoding: 7bit
+X-Auto-Response-Suppress: All
+Auto-Submitted: auto-generated
+X-Mailer: Zendesk Mailer
+X-Delivery-Context: event-id-485367902228
+X-Zendesk-From-Account-Id: 2cbf2e0
+DKIM-Signature:  v=1; a=rsa-sha256; c=relaxed/relaxed; d=zendesk.com;
+ q=dns/txt; s=zendesk1; t=1520986899;
+ bh=3+2uGgphLJeV0hFB5SjWTxUaMpTILdwsqdcMyqiNU5g=;
+ h=date:from:reply-to:to:message-id:in-reply-to:subject:mime-version:content-type:content-transfer-encoding;
+ b=LcMwrW9N89j0zoa6+NevVVKPVyx5k5o4jvJlenwPQKPDF7i8M8Jpf+Olx+VFna5eEkV0xlFtLFdgYrGdZ6kVOSjBOjW58a1rxs3Xdgn300VG0dVx9dH//CdAg7sb3f7EMIPF4nBE7ororf+yvceDIY2XIdDHTyJiRk629RX5Q+Q=
+DKIM-Signature:  v=1; a=rsa-sha256; c=relaxed/relaxed;
+ d=gmailmarkup.zendesk.com; q=dns/txt; s=zendesk1; t=1520986899;
+ bh=3+2uGgphLJeV0hFB5SjWTxUaMpTILdwsqdcMyqiNU5g=;
+ h=date:from:reply-to:to:message-id:in-reply-to:subject:mime-version:content-type:content-transfer-encoding;
+ b=KA/qHvi70gJ/IjsrgG+NWB4BJ9i+QUTYCk8hSPFfG/AHb1dXldrezjyGgy13g85VGrtQmzLdj2bpFpH3gHDuKY9nvbNMepj8WhogeapUsuaqYlQxHtX2HnvBbsbOv5xTYz+uVlQBkFuvRUim8P64eFvpXdsk6eqXZCWPUoDmRBI=
+X-CMAE-Score: 0
+X-CMAE-Analysis: v=2.3 cv=DJShHRFb c=1 sm=1 tr=0
+	a=PnllObM1nxnwd3s59rm/oQ==:117 a=KiCxJD0x+Pe5VASQKmYoJrcyuOo=:19
+	a=IkcTkHD0fZMA:10 a=ZZnuYtJkoWoA:10 a=p0WdMEafAAAA:8 a=A-Ay9Xv3AAAA:8
+	a=zAaIuZ-tlQE1c5dlMioA:9 a=QEXdDO2ut3YA:10 a=TeuQqM9sAAAA:8
+	a=SSmOFEACAAAA:8 a=9TiCOqs2P4ysbtUUcEZcEYoys8A=:19 a=5vMuza9J3cUQAzhx:21
+	a=_W_S_7VecoQA:10 a=frz4AuCg-hUA:10
+X-AV-Checked: Zendesk using ClamAV - CLEAN
+X-CMAE-Score: 0
+X-CMAE-Analysis: v=2.3 cv=DJShHRFb c=1 sm=1 tr=0
+	a=WkljmVdYkabdwxfqvArNOQ==:117 a=KiCxJD0x+Pe5VASQKmYoJrcyuOo=:19
+	a=IkcTkHD0fZMA:10 a=v2DPQv5-lfwA:10 a=ZZnuYtJkoWoA:10 a=p0WdMEafAAAA:8
+	a=A-Ay9Xv3AAAA:8 a=zAaIuZ-tlQE1c5dlMioA:9 a=QEXdDO2ut3YA:10
+	a=TeuQqM9sAAAA:8 a=SSmOFEACAAAA:8 a=9TiCOqs2P4ysbtUUcEZcEYoys8A=:19
+	a=5vMuza9J3cUQAzhx:21 a=_W_S_7VecoQA:10 a=frz4AuCg-hUA:10
+Content-Length: 4768
+
+
+----==_mimepart_5aa86b1391a39_174033fc97a2cb98c72164
+Content-Type: text/plain;
+ charset=utf-8
+Content-Transfer-Encoding: quoted-printable
+
+##- Please type your reply above this line -##
+
+Hi Anarcat+gitlab,
+
+Thank you for contacting GitLab Support. You can reply to this email at a=
+ny time to add new information to the ticket. While you wait, you may fin=
+d helpful information in our documentation at https://docs.gitlab.com. =
+
+
+You opened this ticket by sending us an email. You can also create ticket=
+s via our web form at https://support.gitlab.com/hc/en-us/requests/new. T=
+he form is tailored to the product you're using and helps ensure our supp=
+ort team has all the details necessary to understand your problem. =
+
+
+We look forward to helping you resolve your request shortly!
+
+Best regards,
+The GitLab team
+
+Did you know? You can keep track of all of your tickets and their current=
+ status using our support web interface! Visit https://support.gitlab.com=
+ and sign in. You can also go directly to details about this ticket at ht=
+tps://support.gitlab.com/hc/requests/92542. We recommend using the suppor=
+t web interface for a superior experience managing your tickets. Email co=
+mments can be difficult to follow.
+
+Don't know your support account password? By emailing us, an account was =
+pre-created for you but you will need to reset your password first. Reque=
+st a new password at https://gitlab.zendesk.com/auth/v2/login/password_re=
+set. Follow the instructions in the password reset email to gain access t=
+o your support account. Now you will be able to see all of your tickets!
+
+--------------------------------
+This email is a service from GitLab, Inc..
+
+
+
+
+
+
+
+
+
+[9379QM-5Z39]=
+
+----==_mimepart_5aa86b1391a39_174033fc97a2cb98c72164
+Content-Type: text/html;
+ charset=utf-8
+Content-Transfer-Encoding: quoted-printable
+
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://ww=
+w.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html>
+<head>
+  <meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Dutf-8=
+" />
+  <style type=3D"text/css">
+    table td {
+      border-collapse: collapse;
+    }
+  </style>
+</head>
+<body style=3D"width: 100%!important; margin: 0; padding: 0;">
+  <div style=3D"padding: 10px ; line-height: 18px; font-family: 'Lucida G=
+rande',Verdana,Arial,sans-serif; font-size: 12px; color:#444444;">
+    <div style=3D"color: #b5b5b5;">##- Please type your reply above this =
+line -##</div>
+    <p>Hi Anarcat+gitlab,</p><p>Thank you for contacting GitLab Support. =
+You can reply to this email at any time to add new information to the tic=
+ket. While you wait, you may find helpful information in our documentatio=
+n at <a href=3D"https://docs.gitlab.com" rel=3D"noreferrer">https://docs.=
+gitlab.com</a>. </p><p>You opened this ticket by sending us an email. You=
+ can also create tickets via our web form at <a href=3D"https://support.g=
+itlab.com/hc/en-us/requests/new" rel=3D"noreferrer">https://support.gitla=
+b.com/hc/en-us/requests/new</a>. The form is tailored to the product you'=
+re using and helps ensure our support team has all the details necessary =
+to understand your problem. </p><p>We look forward to helping you resolve=
+ your request shortly!</p><p>Best regards,<br />The GitLab team</p><p>Did=
+ you know? You can keep track of all of your tickets and their current st=
+atus using our support web interface! Visit <a href=3D"https://support.gi=
+tlab.com" rel=3D"noreferrer">https://support.gitlab.com</a> and sign in. =
+You can also go directly to details about this ticket at <a href=3D"https=
+://support.gitlab.com/hc/requests/92542" rel=3D"noreferrer">https://suppo=
+rt.gitlab.com/hc/requests/92542</a>. We recommend using the support web i=
+nterface for a superior experience managing your tickets. Email comments =
+can be difficult to follow.</p><p>Don't know your support account passwor=
+d? By emailing us, an account was pre-created for you but you will need t=
+o reset your password first. Request a new password at <a href=3D"https:/=
+/gitlab.zendesk.com/auth/v2/login/password_reset" rel=3D"noreferrer">http=
+s://gitlab.zendesk.com/auth/v2/login/password_reset</a>. Follow the instr=
+uctions in the password reset email to gain access to your support accoun=
+t. Now you will be able to see all of your tickets!</p>
+  </div>
+<span style=3D'color:#FFFFFF'>[9379QM-5Z39]</span></body>
+</html>
+<div itemscope itemtype=3D"http://schema.org/EmailMessage" style=3D"displ=
+ay:none">  <div itemprop=3D"action" itemscope itemtype=3D"http://schema.o=
+rg/ViewAction">    <link itemprop=3D"url" href=3D"https://support.gitlab.=
+com/hc/requests/92542" />    <meta itemprop=3D"name" content=3D"View tick=
+et"/>  </div></div>=
+
+----==_mimepart_5aa86b1391a39_174033fc97a2cb98c72164--
-- 
2.11.0

* [PATCH 2/2] test: add known broken test for indexing an In-Reply-To loop.
  2018-03-20 21:22 ` [PATCH 1/2] test: two new messages for the 'broken' corpus David Bremner
@ 2018-03-20 21:22   ` David Bremner
  2018-03-20 22:09     ` Tomi Ollila
  2018-04-02 11:03     ` [PATCH] WIP: test patch for reference loop problem David Bremner
  0 siblings, 2 replies; 22+ messages in thread
From: David Bremner @ 2018-03-20 21:22 UTC (permalink / raw)
  To: Antoine Beaupré, notmuch

This documents the bug discussed in

     id:87d10042pu.fsf@curie.anarc.at
---
 test/T050-new.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/test/T050-new.sh b/test/T050-new.sh
index cd522364..c55a2d97 100755
--- a/test/T050-new.sh
+++ b/test/T050-new.sh
@@ -354,4 +354,9 @@ exit status: 75
 EOF
 test_expect_equal_file EXPECTED OUTPUT
 
+add_email_corpus broken
+test_begin_subtest "reference loop"
+test_subtest_known_broken
+test_expect_code 0 "notmuch show --format=json id:9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com > OUTPUT"
+
 test_done
-- 
2.11.0

* Re: [PATCH 2/2] test: add known broken test for indexing an In-Reply-To loop.
  2018-03-20 21:22   ` [PATCH 2/2] test: add known broken test for indexing an In-Reply-To loop David Bremner
@ 2018-03-20 22:09     ` Tomi Ollila
  2018-03-21  1:34       ` David Bremner
  2018-04-02 11:03     ` [PATCH] WIP: test patch for reference loop problem David Bremner
  1 sibling, 1 reply; 22+ messages in thread
From: Tomi Ollila @ 2018-03-20 22:09 UTC (permalink / raw)
  To: David Bremner, Antoine Beaupré, notmuch

On Tue, Mar 20 2018, David Bremner wrote:

> This documents the bug discussed in
>
>      id:87d10042pu.fsf@curie.anarc.at

Do we need the full messages, or just the minimal part that makes the bug
appear? Or is such butchering considered inappropriate?

> ---
>  test/T050-new.sh | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/test/T050-new.sh b/test/T050-new.sh
> index cd522364..c55a2d97 100755
> --- a/test/T050-new.sh
> +++ b/test/T050-new.sh
> @@ -354,4 +354,9 @@ exit status: 75
>  EOF
>  test_expect_equal_file EXPECTED OUTPUT
>  
> +add_email_corpus broken
> +test_begin_subtest "reference loop"
> +test_subtest_known_broken
> +test_expect_code 0 "notmuch show --format=json id:9379QM5Z39_5aa86b134fcfb_174033fc97a2cb98c7198d_sprut@zendesk.com > OUTPUT"
> +
>  test_done
> -- 
> 2.11.0

* Re: [PATCH 2/2] test: add known broken test for indexing an In-Reply-To loop.
  2018-03-20 22:09     ` Tomi Ollila
@ 2018-03-21  1:34       ` David Bremner
  0 siblings, 0 replies; 22+ messages in thread
From: David Bremner @ 2018-03-21  1:34 UTC (permalink / raw)
  To: Tomi Ollila, Antoine Beaupré, notmuch

Tomi Ollila <tomi.ollila@iki.fi> writes:

> On Tue, Mar 20 2018, David Bremner wrote:
>
>> This documents the bug discussed in
>>
>>      id:87d10042pu.fsf@curie.anarc.at
>
> Do we need the full messages, or just the minimal part that makes the bug
> appear? Or is such butchering considered inappropriate?
>

We probably don't strictly need the full messages, I'm just short on
time at the moment.

d

* Re: bug: "no top level messages" crash on Zen email loops
  2018-03-19 20:03     ` bug: "no top level messages" crash on Zen email loops David Bremner
@ 2018-03-29  3:17       ` Olly Betts
  2018-03-29 12:50         ` Antoine Beaupré
  0 siblings, 1 reply; 22+ messages in thread
From: Olly Betts @ 2018-03-29  3:17 UTC (permalink / raw)
  To: David Bremner; +Cc: Antoine Beaupré, notmuch, xapian-discuss

On Mon, Mar 19, 2018 at 05:03:21PM -0300, David Bremner wrote:
> I can confirm this reproduces both the xapian-check and the notmuch-show
> error. Olly agrees that whatever notmuch is doing wrong, it shouldn't
> lead to a corrupted database

There was a Xapian bug here, which I fixed on master last week; the fix
will be in 1.4.6.

If changes to a new database which didn't modify the termlist table were
committed, then a disk block which had been allocated to be the root
block in the termlist table was leaked (not used but not on the
freelist of blocks the table can recycle).  This was largely harmless,
except that it was detected by Database::check() and caused an error.
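
To make the leak described above concrete, here is a toy model in Python.
It is not Xapian's code: the function name `find_leaked_blocks`, its
parameters, and the block numbers are all assumptions made for
illustration. The model treats a table as a range of allocated block
numbers, each of which should be either in use by the B-tree or on the
freelist; a block that is neither is the kind of thing a consistency
check would complain about.

```python
def find_leaked_blocks(first_unused, used, freelist):
    """Return allocated block numbers that are neither in use nor free.

    Hypothetical model, not Xapian's on-disk format:
      first_unused -- blocks 0 .. first_unused-1 have been allocated
      used         -- block numbers reachable from the table's root
      freelist     -- block numbers the table may recycle
    """
    allocated = set(range(first_unused))
    return sorted(allocated - set(used) - set(freelist))


# Illustrative numbers only: four blocks allocated, the live root is
# block 3, blocks 1 and 2 are recyclable, and block 0 was allocated as
# a root but then abandoned without being put on the freelist.
leaked = find_leaked_blocks(first_unused=4, used={3}, freelist={1, 2})
print(leaked)  # [0]
```

In this model the leaked block costs a little disk space but is never
read, which fits the description of the problem as largely harmless
apart from the check failing.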

Cheers,
    Olly

* Re: bug: "no top level messages" crash on Zen email loops
  2018-03-29  3:17       ` Olly Betts
@ 2018-03-29 12:50         ` Antoine Beaupré
  2018-03-29 16:31           ` David Bremner
  2018-03-30  4:35           ` Olly Betts
  0 siblings, 2 replies; 22+ messages in thread
From: Antoine Beaupré @ 2018-03-29 12:50 UTC (permalink / raw)
  To: Olly Betts, David Bremner; +Cc: notmuch, xapian-discuss

On 2018-03-29 04:17:21, Olly Betts wrote:
> On Mon, Mar 19, 2018 at 05:03:21PM -0300, David Bremner wrote:
>> I can confirm this reproduces both the xapian-check and the notmuch-show
>> error. Olly agrees that whatever notmuch is doing wrong, it shouldn't
>> lead to a corrupted database
>
> There was a Xapian bug here, which I fixed on master last week; the fix
> will be in 1.4.6.

An honor. It's not every day you find a bug in database software. ;)

> If changes to a new database which didn't modify the termlist table were
> committed, then a disk block which had been allocated to be the root
> block in the termlist table was leaked (not used but not on the
> freelist of blocks the table can recycle).  This was largely harmless,
> except that it was detected by Database::check() and caused an error.

Hmm... but if I understand correctly, that's one part of the story: I
could get that error and not have the problem with `notmuch show`. Does
that *also* resolve the issue with email loops?

A.

-- 
Travail, du latin Tri Palium trois pieux, instrument de torture.

* Re: bug: "no top level messages" crash on Zen email loops
  2018-03-29 12:50         ` Antoine Beaupré
@ 2018-03-29 16:31           ` David Bremner
  2018-03-30  4:35           ` Olly Betts
  1 sibling, 0 replies; 22+ messages in thread
From: David Bremner @ 2018-03-29 16:31 UTC (permalink / raw)
  To: Antoine Beaupré, Olly Betts; +Cc: notmuch, xapian-discuss

Antoine Beaupré <anarcat@orangeseeds.org> writes:

> Hmm... but if I understand correctly, that's one part of the story: I
> could get that error and not have the problem with `notmuch show`. Does
> that *also* resolve the issue with email loops?

I don't think so, no.

d

* Re: bug: "no top level messages" crash on Zen email loops
  2018-03-29 12:50         ` Antoine Beaupré
  2018-03-29 16:31           ` David Bremner
@ 2018-03-30  4:35           ` Olly Betts
  1 sibling, 0 replies; 22+ messages in thread
From: Olly Betts @ 2018-03-30  4:35 UTC (permalink / raw)
  To: Antoine Beaupré; +Cc: David Bremner, notmuch, xapian-discuss

On Thu, Mar 29, 2018 at 08:50:22AM -0400, Antoine Beaupré wrote:
> On 2018-03-29 04:17:21, Olly Betts wrote:
> > If changes to a new database which didn't modify the termlist table were
> > committed, then a disk block which had been allocated to be the root
> > block in the termlist table was leaked (not used but not on the
> > freelist of blocks the table can recycle).  This was largely harmless,
> > except that it was detected by Database::check() and caused an error.
> 
> Hmm... but if I understand correctly, that's one part of the story: I
> could get that error and not have the problem with `notmuch show`. Does
> that *also* resolve the issue with email loops?

Yes, from what bremner said on IRC there's still a notmuch bug here.

My reply was really just in the context of Xapian to note what the bug
actually was and when the fix would appear (since bremner sent his
message to both the notmuch and Xapian lists).

Cheers,
    Olly

* [PATCH] WIP: test patch for reference loop problem
  2018-03-20 21:22   ` [PATCH 2/2] test: add known broken test for indexing an In-Reply-To loop David Bremner
  2018-03-20 22:09     ` Tomi Ollila
@ 2018-04-02 11:03     ` David Bremner
  2018-04-13  0:10       ` Antoine Beaupré
  1 sibling, 1 reply; 22+ messages in thread
From: David Bremner @ 2018-04-02 11:03 UTC (permalink / raw)
  To: David Bremner, Antoine Beaupré, notmuch

---
 lib/thread.cc | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/thread.cc b/lib/thread.cc
index 3561b27f..356d63ce 100644
--- a/lib/thread.cc
+++ b/lib/thread.cc
@@ -391,10 +391,15 @@ static void
 _resolve_thread_relationships (notmuch_thread_t *thread)
 {
     notmuch_message_node_t *node;
-    notmuch_message_t *message, *parent;
+    notmuch_message_t *message, *first_message = NULL, *parent;
     const char *in_reply_to;
 
-    for (node = thread->message_list->head; node; node = node->next) {
+    node = thread->message_list->head;
+    if (node) {
+	first_message = node->message;
+	node = node->next;
+    }
+    for (; node; node = node->next) {
 	message = node->message;
 	in_reply_to = _notmuch_message_get_in_reply_to (message);
 	if (in_reply_to && strlen (in_reply_to) &&
@@ -406,6 +411,19 @@ _resolve_thread_relationships (notmuch_thread_t *thread)
 	    _notmuch_message_list_add_message (thread->toplevel_list, message);
     }
 
+    /* XXX: this is probably nonsense: if we didn't find any top level
+     * messages, choose one at random */
+    if (first_message) {
+	in_reply_to = _notmuch_message_get_in_reply_to (first_message);
+	if (thread->toplevel_list->head && in_reply_to && strlen (in_reply_to) &&
+	    g_hash_table_lookup_extended (thread->message_hash,
+					  in_reply_to, NULL,
+					  (void **) &parent))
+	    _notmuch_message_add_reply (parent, first_message);
+	else
+	    _notmuch_message_list_add_message (thread->toplevel_list, first_message);
+    }
+
     /* XXX: After scanning through the entire list looking for parents
      * via "In-Reply-To", we should do a second pass that looks at the
      * list of messages IDs in the "References" header instead. (And
-- 
2.16.2
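
For readers who want to see why an In-Reply-To loop leaves a thread with
no toplevel messages, and what the WIP patch above does about it, here is
a simplified Python model. It is only a sketch of the idea, not the code
in lib/thread.cc: the function `resolve_toplevel`, the `rescue_first`
flag, and the message ids "a" and "b" are hypothetical names chosen for
illustration.

```python
def resolve_toplevel(messages, rescue_first=True):
    """Split a thread into toplevel messages and replies.

    messages: list of (message_id, in_reply_to_or_None) in thread order.
    Returns (toplevel_ids, replies), where replies maps a parent id to
    the list of ids that reply to it.
    """
    known = {mid for mid, _ in messages}
    toplevel, replies = [], {}

    def place(mid, parent, allow_parent=True):
        # A message whose parent is in the thread becomes a reply;
        # anything else becomes a toplevel message.
        if allow_parent and parent and parent in known:
            replies.setdefault(parent, []).append(mid)
        else:
            toplevel.append(mid)

    # With rescue_first, the first message is held back and placed last.
    rest = messages[1:] if rescue_first else messages
    for mid, parent in rest:
        place(mid, parent)

    if rescue_first and messages:
        mid, parent = messages[0]
        # Attach the first message to its parent only if some other
        # message already became toplevel; otherwise promote it.
        place(mid, parent, allow_parent=bool(toplevel))
    return toplevel, replies


# Two messages that each claim to be a reply to the other.
loop = [("a", "b"), ("b", "a")]
print(resolve_toplevel(loop, rescue_first=False))  # ([], {'b': ['a'], 'a': ['b']})
print(resolve_toplevel(loop))                      # (['a'], {'a': ['b']})
```

Without the fallback every message finds a parent, so the toplevel list
stays empty. With it, the first message in the list is promoted, which
is an arbitrary but deterministic choice in this model.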

* Re: [PATCH] WIP: test patch for reference loop problem
  2018-04-02 11:03     ` [PATCH] WIP: test patch for reference loop problem David Bremner
@ 2018-04-13  0:10       ` Antoine Beaupré
  2018-04-13 11:17         ` David Bremner
  0 siblings, 1 reply; 22+ messages in thread
From: Antoine Beaupré @ 2018-04-13  0:10 UTC (permalink / raw)
  To: David Bremner, David Bremner, notmuch

Hi!

So I've tried the patch and it seems to fix the bug. I'll run with a
patched version for a while to see if anything's off, but so far so good
I'd say.

Furthermore, it's not possible for me to reproduce the bug in my regular
mailbox anymore. I suspect this is because new mail came in and the file
order in the directories changed, so the bug isn't triggered anymore.

I was able to trigger the bug with the reproducer with an older build of
the code though, so don't worry about that part. :)

Let me know if you need anything else from me before this gets merged.

Cheers!

A.

-- 
Only in the darkness can you see the stars.
                        - Martin Luther King, Jr.

* Re: [PATCH] WIP: test patch for reference loop problem
  2018-04-13  0:10       ` Antoine Beaupré
@ 2018-04-13 11:17         ` David Bremner
  0 siblings, 0 replies; 22+ messages in thread
From: David Bremner @ 2018-04-13 11:17 UTC (permalink / raw)
  To: Antoine Beaupré, notmuch; +Cc: Tomi Ollila

Antoine Beaupré <anarcat@orangeseeds.org> writes:

> Hi!
>
> So I've tried the patch and it seems to fix the bug. I'll run with a
> patched version for a while to see if anything's off, but so far so good
> I'd say.
>
> Furthermore, it's not possible for me to reproduce the bug in my regular
> mailbox anymore. I suspect this is because new mail came in and the file
> order in the directories changed, so the bug isn't triggered anymore.
>
> I was able to trigger the bug with the reproducer with an older build of
> the code though, so don't worry about that part. :)

Thanks for testing!

> Let me know if you need anything else from me before this gets merged.
>

There was also a test patch, basically adding your reproducer to the
test suite. It would be good to know if that test still reproduces the
problem, before the fix is applied.  Tomi mentioned a more reduced test
set. That could reduce the privacy loss a bit, but as far as file size
goes these messages are pretty small, so I'm not sure it's worth the
trouble or the risk of breaking the reproducer.

I want to refactor the code a bit and hopefully cut down on the
copy-pasta, so I will probably ask you to check a second version.

d

* Re: bug: "no top level messages" crash on Zen email loops
  2018-03-19 13:25 bug: "no top level messages" crash on Zen email loops Antoine Beaupré
  2018-03-19 16:36 ` David Bremner
  2018-03-20 21:22 ` [PATCH 1/2] test: two new messages for the 'broken' corpus David Bremner
@ 2018-04-28 13:28 ` David Bremner
  2 siblings, 0 replies; 22+ messages in thread
From: David Bremner @ 2018-04-28 13:28 UTC (permalink / raw)
  To: Antoine Beaupré, notmuch

Antoine Beaupré <anarcat@orangeseeds.org> writes:

> Hi!
>
> Here's a fun bug for you Xapian tricksters.
>
> Two emails attached make notmuch crash when trying to display the
> folder.
>
> $ notmuch show thread:0000000000000001
> Internal error: Thread 0000000000000001 has no toplevel messages.
>  (notmuch-show.c:1012)

this bug should be fixed in notmuch 0.26.2

d

Thread overview: 22+ messages
2018-03-19 13:25 bug: "no top level messages" crash on Zen email loops Antoine Beaupré
2018-03-19 16:36 ` David Bremner
2018-03-19 17:50   ` Antoine Beaupré
2018-03-19 17:56     ` Antoine Beaupré
2018-03-19 19:25       ` tip: how to not forget attachments Antoine Beaupré
2018-03-19 19:57         ` Brian Sniffen
2018-03-19 20:16           ` Antoine Beaupré
2018-03-19 21:40             ` Brian Sniffen
2018-03-19 21:47               ` Antoine Beaupré
2018-03-19 20:03     ` bug: "no top level messages" crash on Zen email loops David Bremner
2018-03-29  3:17       ` Olly Betts
2018-03-29 12:50         ` Antoine Beaupré
2018-03-29 16:31           ` David Bremner
2018-03-30  4:35           ` Olly Betts
2018-03-20 21:22 ` [PATCH 1/2] test: two new messages for the 'broken' corpus David Bremner
2018-03-20 21:22   ` [PATCH 2/2] test: add known broken test for indexing an In-Reply-To loop David Bremner
2018-03-20 22:09     ` Tomi Ollila
2018-03-21  1:34       ` David Bremner
2018-04-02 11:03     ` [PATCH] WIP: test patch for reference loop problem David Bremner
2018-04-13  0:10       ` Antoine Beaupré
2018-04-13 11:17         ` David Bremner
2018-04-28 13:28 ` bug: "no top level messages" crash on Zen email loops David Bremner
