* BUG: "notmuch insert" fails with "Delivery of non-mail file"
From: Alvaro Herrera @ 2019-01-18 16:07 UTC
To: notmuch

Hello

I've been using notmuch successfully for a couple of years now (mostly
via neomutt).  Thanks for developing it.

Not long ago I switched my mail setup to use notmuch insert via
mailfilter instead of good old procmail.  However, since then a number
of emails are reported by notmuch as "non-mail", and appear to not be
indexed.  (I use --keep, so they're still in my maildir).

In my reading of the code, this ultimately comes from
g_mime_parser_construct_message rejecting the message.
I reported this to GMime, and they said that the problem is that notmuch
insert is using the mbox mode:
https://github.com/jstedfast/gmime/issues/58
(Sample email is attached there).

As far as I can tell, this is all coming from
_notmuch_message_file_parse() which sets the is_mbox flag when it sees
the "^From " line at the start of the file ... which kinda makes sense
in general terms, but for notmuch-insert I think that's the wrong thing
to do.  Maybe a solution is to pass a flag down from notmuch-insert.c's
add_file all the way down to _notmuch_message_file_parse telling it not
to treat the file as an mbox.

I *think* that not all of the messages that fail parsing contain an
email attachment, so maybe I'll come back with further issues later on.
This is the first one I debugged.

Thanks

-- 
Álvaro Herrera
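
The check described in the message above can be illustrated with a short Python sketch. This is not notmuch's C code: `looks_like_mbox` and `count_mbox_separators` are hypothetical helper names, and the sample message is made up. It only assumes what the thread states, namely that a file whose first line starts with "From " is treated as an mbox, and it shows why an unescaped "From " line further down (for example in an attached patch) then looks like a second message.

```python
def looks_like_mbox(raw: bytes) -> bool:
    """Return True when the file begins with an mbox-style 'From ' line."""
    return raw.startswith(b"From ")


def count_mbox_separators(raw: bytes) -> int:
    """Count lines that a naive mbox reader would treat as message separators."""
    return sum(1 for line in raw.splitlines() if line.startswith(b"From "))


# Made-up single message: an mbox 'From ' line was prepended on delivery,
# and the body carries an unescaped 'From ' line from an attached patch.
sample = (
    b"From sender@example.org Fri Jan 18 16:07:00 2019\n"
    b"Subject: patch attached\n"
    b"\n"
    b"From 0123abc Mon Sep 17 00:00:00 2001\n"
    b"Subject: [PATCH] example\n"
)

print(looks_like_mbox(sample))        # True
print(count_mbox_separators(sample))  # 2
```

One file, one message, but two apparent separators once the file is read in mbox mode.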
* Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"
From: David Bremner @ 2019-01-19 18:17 UTC
To: Alvaro Herrera, notmuch

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

> In my reading of the code, this ultimately comes from
> g_mime_parser_construct_message rejecting the message.
> I reported this to GMime, and they said that the problem is that notmuch
> insert is using the mbox mode:
> https://github.com/jstedfast/gmime/issues/58
> (Sample email is attached there).

This issue (or a related one) has come up before

     https://nmbug.notmuchmail.org/nmweb/search/postfix+mbox

Generally it seems to be caused by tools that add mbox 'From ' headers,
without actually mbox escaping the file.  We haven't yet reached
consensus on a good solution (generally people just want to fix their
own mail, which is understandable).  A workaround discussed in the
messages I reference above is to strip the 'From ' header before passing
to notmuch-insert.  Perhaps some scholar of the RFCs can convince us that
that is "always" the right thing for notmuch insert to do.

> As far as I can tell, this is all coming from
> _notmuch_message_file_parse() which sets the is_mbox flag when it sees
> the "^From " line at the start of the file ... which kinda makes sense
> in general terms, but for notmuch-insert I think that's the wrong thing
> to do.  Maybe a solution is to pass a flag down from notmuch-insert.c's
> add_file all the way down to _notmuch_message_file_parse telling it not
> to treat the file as an mbox.
>

I'd be worried about letting notmuch-insert deliver messages that
notmuch-new would not be able to parse.  In particular we'd like to keep
the property that a Maildir + the output of notmuch-dump should be
enough to completely recover the notmuch database.
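
The workaround mentioned above (stripping the 'From ' line before the message reaches notmuch-insert) might look roughly like the following Python sketch. `strip_mbox_from_line` is a hypothetical helper, not part of notmuch, and the sample message is invented; the sketch only drops the first line when it starts with "From " and leaves everything else untouched.

```python
def strip_mbox_from_line(raw: bytes) -> bytes:
    """Drop a leading mbox 'From ' separator line, if the message has one."""
    if raw.startswith(b"From "):
        # Split off the first line; 'rest' is everything after its newline.
        _first, _newline, rest = raw.partition(b"\n")
        return rest
    return raw


# Made-up message as an MDA might hand it over, with a prepended 'From ' line.
message = (
    b"From sender@example.org Fri Jan 18 16:07:00 2019\n"
    b"From: Sender <sender@example.org>\n"
    b"Subject: hello\n"
    b"\n"
    b"body\n"
)

cleaned = strip_mbox_from_line(message)
print(cleaned.startswith(b"From: "))  # True
```

In a delivery pipeline, a filter like this would read the message on standard input and write the result to whatever feeds notmuch-insert; that wiring is left out here because it depends on the local mail setup.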
* Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"
From: Alvaro Herrera @ 2019-01-21 19:53 UTC
To: David Bremner; +Cc: notmuch

Hi David, thanks for replying.

On 2019-Jan-19, David Bremner wrote:

> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>
> > In my reading of the code, this ultimately comes from
> > g_mime_parser_construct_message rejecting the message.
> > I reported this to GMime, and they said that the problem is that notmuch
> > insert is using the mbox mode:
> > https://github.com/jstedfast/gmime/issues/58
> > (Sample email is attached there).
>
> This issue (or a related one) has come up before
>
>      https://nmbug.notmuchmail.org/nmweb/search/postfix+mbox
>
> Generally it seems to be caused by tools that add mbox 'From ' headers,
> without actually mbox escaping the file.  We haven't yet reached
> consensus on a good solution (generally people just want to fix their
> own mail, which is understandable).  A workaround discussed in the
> messages I reference above is to strip the 'From ' header before passing
> to notmuch-insert.  Perhaps some scholar of the RFCs can convince us that
> that is "always" the right thing for notmuch insert to do.

I'm not sure I follow.  As I understand, notmuch does not work with
mboxes, only with maildirs, so the behavior of splitting emails at
"From " is not strictly necessary, since one file always equals one
message.

As for RFC scholarship, I spent some time looking at
https://tools.ietf.org/html/rfc5322
to see if it defined any sort of message separator ... but as far as I
can tell, it only defines what a valid message looks like.  It doesn't
say where one message ends.

On the other hand, in my world, it's been quite a while since 'From '
was considered a useful message separator.  This stopped being true in a
pretty extensive way when git format-patch messages started being
posted as attachments.  But even before that, MUAs stopped adding the
">" at the start of a "From " line in human-written text.

Nowadays what really governs the split is the Content-Length header,
from the MIME definitions.  Most tools do not escape lines starting with
'From ' anymore.  As far as I can tell, this is defined by RFC 2046,
https://tools.ietf.org/html/rfc2046#section-5.1.1
which states that the implementation must look for the "boundary
delimiter line".  Stopping at a "From " line before finding the boundary
delimiter line would be a mistake, in my reading.

> > As far as I can tell, this is all coming from
> > _notmuch_message_file_parse() which sets the is_mbox flag when it sees
> > the "^From " line at the start of the file ... which kinda makes sense
> > in general terms, but for notmuch-insert I think that's the wrong thing
> > to do.  Maybe a solution is to pass a flag down from notmuch-insert.c's
> > add_file all the way down to _notmuch_message_file_parse telling it not
> > to treat the file as an mbox.
>
> I'd be worried about letting notmuch-insert deliver messages that
> notmuch-new would not be able to parse.  In particular we'd like to keep
> the property that a Maildir + the output of notmuch-dump should be
> enough to completely recover the notmuch database.

Hmm, that's a good point -- I assume that notmuch-new should be patched
similarly so that those messages are valid there too.  So maybe the
solution (given that, as I said above, Notmuch does not appear to handle
mboxes at all) is to just set the mbox flag to false completely ...

-- 
Álvaro Herrera
PostgreSQL Expert, https://www.2ndQuadrant.com/
* Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"
From: David Bremner @ 2019-02-01 19:33 UTC
To: Alvaro Herrera; +Cc: notmuch

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

> I'm not sure I follow.  As I understand, notmuch does not work with
> mboxes, only with maildirs, so the behavior of splitting emails at "From
> " is not strictly necessary, since one file always equals one message.

Checking for mboxes was added as a safety feature since people found
indexing large mboxes led to bad results (bloated index, crashing
indexer, etc...).

> On the other hand, in my world, it's been quite a while since 'From '
> was considered a useful message separator.  This stopped being true in a
> pretty extensive way when git format-patch messages started being
> posted as attachments.

Sure.  Things on disk should either be mboxes, or not.  If they start
with 'From ', they are mboxes.  We attempted to take away support for
single message mboxes, but people complained even more about that.  So
generally, if tools / users don't want to escape 'From ' after the first
line, the first line should not be 'From '.

My original question was whether notmuch-insert should strip the 'From '
(and presumably save as a normal header) before delivery.

d
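
For readers unfamiliar with the escaping referred to above, here is a minimal Python sketch of the traditional mbox convention of prefixing ">" to later lines that start with "From ". `escape_from_lines` is a hypothetical name and the sample is invented. As a simplification, it escapes every matching line after the first rather than distinguishing headers from body.

```python
def escape_from_lines(raw: bytes) -> bytes:
    """Prefix '>' to every line after the first that starts with 'From '."""
    lines = raw.splitlines(keepends=True)
    if not lines:
        return raw
    escaped = [lines[0]]  # the leading separator line stays as it is
    for line in lines[1:]:
        escaped.append(b">" + line if line.startswith(b"From ") else line)
    return b"".join(escaped)


# Made-up single-message mbox with an unescaped 'From ' line in the body.
sample = (
    b"From sender@example.org Fri Feb  1 19:33:00 2019\n"
    b"Subject: x\n"
    b"\n"
    b"From here on the text is body.\n"
    b"bye\n"
)

escaped = escape_from_lines(sample)
print(escaped.splitlines()[3])  # b'>From here on the text is body.'
```

After escaping, only the first line still starts with "From ", which is the state a single-message mbox is expected to be in.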
* Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"
From: Leo L. Schwab @ 2019-03-07 6:57 UTC
To: notmuch

On Fri, Jan 18, 2019 at 01:07:35PM -0300, Alvaro Herrera wrote:
> Not long ago I switched my mail setup to use notmuch insert via
> mailfilter instead of good old procmail.  However, since then a number
> of emails are reported by notmuch as "non-mail", and appear to not be
> indexed.  (I use --keep, so they're still in my maildir).
>
	I've been bumping into the same problem.

	I converted 20+ years worth of mail to maildir format expressly so I
could use notmuch.  I almost didn't do it because the setup was so daunting
(reconfigure system MTA/MDA to deliver in maildir instead of mbox; install,
learn, and set up procmail and/or fetchmail to update the index; modify
muttrc; blah blah blah...).  And then I hit on the idea of creating a
.forward file containing:

	"|/usr/bin/notmuch insert"

	Poof!  Delivery and indexing in one step.

	The downside to this is that, if notmuch-insert fails with the above
error, the MTA tries to bounce the message (so thanks *very* much for making
me aware of the '--keep' option).

	As a result, I've been thinking how this might be addressed.  The
thought I've had is to create a new option to notmuch-insert that
essentially means, "Skip all validation, just index and deliver."  In other
words, the input is presumed to have already been validated by an external
entity, so assume it's good and index and deliver it.  '--keep' effectively
does this already, but it quashes *all* errors.  I just want to skip the
validator.

	I could probably kluge up a prototype if anyone thinks that's a
reasonable idea.

					Thanks,
					Schwab
* Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"
From: David Bremner @ 2019-03-07 21:05 UTC
To: Leo L. Schwab, notmuch

"Leo L. Schwab" <ewhac@ewhac.org> writes:

> 	As a result, I've been thinking how this might be addressed.  The
> thought I've had is to create a new option to notmuch-insert that
> essentially means, "Skip all validation, just index and deliver."  In other
> words, the input is presumed to have already been validated by an external
> entity, so assume it's good and index and deliver it.  '--keep' effectively
> does this already, but it quashes *all* errors.  I just want to skip the
> validator.

If you move your database out of the way and run notmuch-new, are the
messages delivered by your modified notmuch-insert indexed?  I think
that's a property I'd require for anything we were going to carry
upstream.

Also, I'm not sure about turning off _all_ validation vs. just not
checking for mboxes.

d
* Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"
From: Alvaro Herrera @ 2019-03-07 22:03 UTC
To: David Bremner; +Cc: Leo L. Schwab, notmuch

On 2019-Mar-07, David Bremner wrote:

> "Leo L. Schwab" <ewhac@ewhac.org> writes:
>
> > 	As a result, I've been thinking how this might be addressed.  The
> > thought I've had is to create a new option to notmuch-insert that
> > essentially means, "Skip all validation, just index and deliver."  In other
> > words, the input is presumed to have already been validated by an external
> > entity, so assume it's good and index and deliver it.  '--keep' effectively
> > does this already, but it quashes *all* errors.  I just want to skip the
> > validator.
>
> If you move your database out of the way and run notmuch-new, are the
> messages delivered by your modified notmuch-insert indexed?  I think
> that's a property I'd require for anything we were going to carry
> upstream.
>
> Also, I'm not sure about turning off _all_ validation vs. just not
> checking for mboxes.

By the way, did you not have a problem with message
id:878szcwd8c.fsf@swing.csc.kth.se delivered to this very list?  That
one includes an unescaped "^From " line in the body, which is sure to
confuse the message parser ...

-- 
Álvaro Herrera                            39°50'S 73°21'W
"Ah, spring... when a young penguin's fancy lightly turns to thoughts of ...
Beta testing!"                         (Fedora 9 beta announcement)
* Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"
From: David Bremner @ 2019-03-07 22:34 UTC
To: Alvaro Herrera; +Cc: Leo L. Schwab, notmuch

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

> By the way, did you not have a problem with message
> id:878szcwd8c.fsf@swing.csc.kth.se delivered to this very list?  That
> one includes an unescaped "^From " line in the body, which is sure to
> confuse the message parser ...

That particular message is base64 encoded, so that would be unlikely to
cause a problem.

d
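
A small Python sketch of why a base64-encoded body sidesteps the issue: the base64 alphabet contains no space character, so no encoded line can begin with "From " (with the trailing space), whatever the decoded text says. The body below is invented for illustration and has nothing to do with the message referenced above.

```python
import base64

# Made-up body whose second line starts with an unescaped 'From '.
body = b"Some text\nFrom the desk of nobody\nmore text\n"

# encodebytes() wraps its output in newline-terminated lines, as in a mail body.
encoded = base64.encodebytes(body)

print(any(line.startswith(b"From ") for line in body.splitlines()))     # True
print(any(line.startswith(b"From ") for line in encoded.splitlines()))  # False
print(b" " in encoded)                                                  # False
```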
* Re: BUG: "notmuch insert" fails with "Delivery of non-mail file"
From: Alvaro Herrera @ 2019-03-08 0:51 UTC
To: David Bremner; +Cc: Leo L. Schwab, notmuch

On 2019-Mar-07, David Bremner wrote:

> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>
> > By the way, did you not have a problem with message
> > id:878szcwd8c.fsf@swing.csc.kth.se delivered to this very list?  That
> > one includes an unescaped "^From " line in the body, which is sure to
> > confuse the message parser ...
>
> That particular message is base64 encoded, so that would be unlikely to
> cause a problem.

Bah, you're right, it is.  I swear I saw an "!Err" message from maildrop
because of an unadorned "From " in a message recently ... can't find it
now.

-- 
Álvaro Herrera                http://www.linkedin.com/in/alvherre