On Fri, 20 Nov 2009 14:26:25 +0100, Mike Hommey wrote:
> I got a segfault when importing my maildir. It happened because of an
> old weird email, where the message-id is the following:
> Message-ID: <000022b17a1f$00004fbe$00000550@myrop (ew6.southwind.net [216.53.98.70]) by onyx.southwind.net from homepage.com (114.230.197.216) by newmail.spectraweb.ch from default (m202.2-25.warwick.net [
> 218.242.202.80]) by host.warwick.net (8.10.0.Beta10/8.10.0.Beta10) with SMTP id e9GKEKk19201>

Thanks for sharing this Mike, (and for sending me the original file).

> Anyways, the stack dump is the following:
> #0 0x00007ffff6d1e598 in Xapian::Document::add_term(std::string const&, unsigned int) () from /usr/lib/libxapian.so.15
> #1 0x000000000040f5ff in _notmuch_message_add_term (message=0x0, prefix_name=0x41ad7f "tag", value=0x4191b0 "inbox") at lib/message.cc:587
> #2 0x000000000040f827 in notmuch_message_add_tag (message=0x0, tag=0x4191b0 "inbox") at lib/message.cc:668
> #3 0x0000000000407bc8 in tag_inbox_and_unread (message=0x0) at notmuch-new.c:44
> #4 0x0000000000407f63 in add_files_recursive (notmuch=0x62cc20, path=0x832e90 "/home/mh/Maildir/saved-messages/cur", st=0x7fffffffe000, state=0x7fffffffe240) at notmuch-new.c:185
> #5 0x0000000000408036 in add_files_recursive (notmuch=0x62cc20, path=0x832de0 "/home/mh/Maildir/saved-messages", st=0x7fffffffe000, state=0x7fffffffe240) at notmuch-new.c:223
> #6 0x0000000000408036 in add_files_recursive (notmuch=0x62cc20, path=0x62c920 "/home/mh/Maildir", st=0x7fffffffe000, state=0x7fffffffe240) at notmuch-new.c:223
> #7 0x0000000000408245 in add_files (notmuch=0x62cc20, path=0x62c920 "/home/mh/Maildir", state=0x7fffffffe240) at notmuch-new.c:287
> #8 0x0000000000408704 in notmuch_new_command (ctx=0x61f140, argc=0, argv=0x7fffffffe3e8) at notmuch-new.c:431
> #9 0x0000000000406ea8 in main (argc=2, argv=0x7fffffffe3d8) at notmuch.c:400

I didn't get the same crash when importing the file.
But I did get a short document out of it (just a handful of terms
indexed) and, most significantly, an empty message-ID term.

Xapian has a limit on the maximum length of a term, so one thing we'll
need to do here is to notice if the message ID exceeds that length and
then treat it as we treat a missing Message-ID header, (that is,
generate our own message ID by computing a sha-1 hash over the
message).

So, there was an obvious bug in the message-ID handling, (the code was
still looking for NULL for a missing header, but we now return "" for a
missing header instead). I've fixed this.

> Now, looking at the code, there seems to me there actually 3 problems:
> - _notmuch_message_create_for_message_id can return NULL, and while
>   there is a test for it in notmuch_database_add_message, the function
>   still returns a success code

Thanks. This is fixed now.

> - things are still going on even when message is NULL in
>   add_files_recursive

I didn't replicate this case, but it *should* be fixed now that
notmuch_database_add_message is returning a non-success value.

> - for some reason, xapian doesn't want to add the document corresponding
>   to this old spam message: notmuch->xapian_db->add_document throws an
>   exception.

I think things had just gone wrong long before then.

> I can provide the spam if necessary, or can continue debugging the issue
> with some guidance.

Thanks for providing it. It turns out that the giant Message-Id value
wasn't causing the problem. Instead, the message was corrupted by a
stray newline on the third line. (So GMime is seeing only the first two
lines of headers).

We *used* to have working code to detect this kind of file as "not an
email" but again, this broke when we changed
notmuch_message_get_header to return "" instead of NULL for missing
headers.

See patches below (just pushed now as well) for the fixes.

-Carl
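
A minimal sketch of the checks discussed in this message, not the
pushed patches. Every helper name here is hypothetical, the 245-byte
term limit is an assumed value for illustration (the real limit depends
on the Xapian backend), and the choice of From/Subject/To as the
headers consulted for the "not an email" test is likewise an
assumption. The point is only that a missing header has to be
recognized whether it arrives as NULL or as "".

```c
#include <stddef.h>
#include <string.h>

/* Assumed upper bound on a Xapian term, in bytes. Illustrative value
 * only; check the documentation of the backend actually in use. */
#define SKETCH_MAX_TERM_LEN 245

/* Hypothetical helper: returns 1 if a header value should be treated
 * as absent. Accepts both conventions: NULL and the empty string. */
static int
header_is_missing (const char *value)
{
    return value == NULL || *value == '\0';
}

/* Hypothetical helper: returns 1 if message_id can be indexed as a
 * term as-is. A return of 0 means the caller should fall back to a
 * generated ID (for example, one derived from a SHA-1 of the file).
 * prefix_len is the length of the term prefix stored with the ID. */
static int
message_id_is_usable (const char *message_id, size_t prefix_len)
{
    if (header_is_missing (message_id))
        return 0;

    return strlen (message_id) + prefix_len <= SKETCH_MAX_TERM_LEN;
}

/* Hypothetical helper: returns 1 if at least one of the given headers
 * is present, 0 if the file should be rejected as "not an email". */
static int
looks_like_email (const char *from, const char *subject, const char *to)
{
    return ! (header_is_missing (from) &&
              header_is_missing (subject) &&
              header_is_missing (to));
}
```

With helpers along these lines, a caller would test
looks_like_email() before indexing anything and, when
message_id_is_usable() returns 0, generate its own ID rather than
passing the header value on to Xapian.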