From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
    by olra.theworths.org (Postfix) with ESMTP id 0882A431FAF
    for ; Sun, 25 Nov 2012 10:05:29 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Flag: NO
X-Spam-Score: -0.7
X-Spam-Level:
X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
    tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
    by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id xEH+0VN3TK5i for ; Sun, 25 Nov 2012 10:05:28 -0800 (PST)
Received: from dmz-mailsec-scanner-4.mit.edu (DMZ-MAILSEC-SCANNER-4.MIT.EDU [18.9.25.15])
    by olra.theworths.org (Postfix) with ESMTP id 6C2F7431FAE
    for ; Sun, 25 Nov 2012 10:05:28 -0800 (PST)
X-AuditID: 1209190f-b7f636d00000095b-b6-50b25de779a5
Received: from mailhub-auth-2.mit.edu ( [18.7.62.36])
    by dmz-mailsec-scanner-4.mit.edu (Symantec Messaging Gateway)
    with SMTP id 13.5A.02395.7ED52B05; Sun, 25 Nov 2012 13:05:27 -0500 (EST)
Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])
    by mailhub-auth-2.mit.edu (8.13.8/8.9.2) with ESMTP id qAPI5QrZ000959;
    Sun, 25 Nov 2012 13:05:27 -0500
Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])
    (authenticated bits=0) (User authenticated as amdragon@ATHENA.MIT.EDU)
    by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id qAPI5OJs007864
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT);
    Sun, 25 Nov 2012 13:05:26 -0500 (EST)
Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.80)
    (envelope-from ) id 1Tcga4-0003M8-Np; Sun, 25 Nov 2012 13:05:24 -0500
Date: Sun, 25 Nov 2012 13:05:24 -0500
From: Austin Clements
To: Tomi Ollila
Subject: Re: [PATCH 3/3] lib: Reject multi-message mboxes and deprecate single-message mbox
Message-ID: <20121125180524.GL4562@mit.edu>
References: <1353824161-31717-1-git-send-email-amdragon@mit.edu>
    <1353824161-31717-3-git-send-email-amdragon@mit.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFlrBKsWRmVeSWpSXmKPExsUixG6novs8dlOAwcFzFhbXb85ktnizch6r
    A5PH4a8LWTyerbrFHMAUxWWTkpqTWZZapG+XwJXxccd0loIrQhXrtjcxNjDe5uti5OSQEDCR
    6HzXwAZhi0lcuLceyObiEBLYxyixce0ERghnA6PEivv/oTIXmSTWnL0AlVnCKNGx+BojSD+L
    gKrE36nvmEFsNgENiW37l4PFRQRUJB60rWcFsZkFpCW+/W5mArGFBWIl/j9eDGbzCmhLtBzd
    zAIxdC6jxLaZL1kgEoISJ2c+YYFo1pHYufUO0BkcYIOW/+OACMtLNG+dzQwS5hQwkDjzIgQk
    LAq0dsrJbWwTGIVnIRk0C8mgWQiDZiEZtICRZRWjbEpulW5uYmZOcWqybnFyYl5eapGuiV5u
    ZoleakrpJkZQJHBK8u9g/HZQ6RCjAAejEg/vjcSNAUKsiWXFlbmHGCU5mJREeUWAcSTEl5Sf
    UpmRWJwRX1Sak1p8iFGCg1lJhHciE1CONyWxsiq1KB8mJc3BoiTOezXlpr+QQHpiSWp2ampB
    ahFMVoaDQ0mCVxRkqGBRanpqRVpmTglCmomDE2Q4D9BwVZAa3uKCxNzizHSI/ClGXY45M9uf
    MAqx5OXnpUqJ8/KDFAmAFGWU5sHNgSWwV4ziQG8J826OAariASY/uEmvgJYwAS1Jvr4RZElJ
    IkJKqoFR+FKBi84ejZtdF0v3/3R8vfhw1j0WISO+WVxLns5s8rnK2csblat178BVxtnvLpZ0
    BL1Y9EjlSvH13UdFlOpM+Tf2uld19gSfmlSm6mLwI0tb6pLptpV797zVyFE8OmFlV6qx9OGv
    aYci7c3jI+8cnjLJ6NiVH9pyjwtnTF5g3eix5OiJeTVhSizFGYmGWsxFxYkA5hbVGDsDAAA=
Cc: notmuch@notmuchmail.org
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 25 Nov 2012 18:05:29 -0000

Quoth Tomi Ollila on Nov 25 at 3:26 pm:
> On Sun, Nov 25 2012, Austin Clements wrote:
>
> > Previously, we would treat multi-message mboxes as one giant email,
> > which, besides the obvious incorrect indexing, often led to
> > out-of-memory errors for archival mboxes. Now we explicitly reject
> > multi-message mboxes. For historical reasons, we retain support for
> > single-message mboxes, but officially deprecate this behavior.
> >
>
> The series looks good to me -- but I don't know about deprecating
> single-message mboxes:
>
> * If we someday support (read-only?) mbox format, then single-message
>   mboxes are "normal" again.

If notmuch does gain mbox support, then its handling of single-message
mboxes will *definitely* change because it will stop doing
maildir-like things to them (flag sync, moving from new to cur, etc.),
which people may currently be depending on. This was one of the
motivations for deprecating the current handling of single-message
mboxes.

> * Some naïve mb2md scripts could leave the 'From ' line intact: for
>   example `formail -bz -s head -3 < $MAIL`(*) can be used to demonstrate this

I would call that "buggy", rather than "naïve". ]:--8)

> * Some people may have a large collection of single-file messages starting
>   with 'From ' currently indexed. If those are to be re-indexed later
>   without "single-message mbox" support, that is somewhat of a burden to
>   the users (**)

That's why this only deprecates them (with a warning) and doesn't drop
support for them. The idea is to keep the historical handling for a
few releases, and then we'll have the flexibility to do what we want
with single-message mboxes (including supporting them as real mbox).

It's probably a good idea to include a script or a wiki pointer for
fixing single-message mboxes in the NEWS. As long as the file name is
kept the same, notmuch won't reindex it.

> (*) my "mb2md" wannabe does gnus-like "$formail" -bz -R 'From ' X-From-Line: ...
>
> (**) Something like the following could be used to mangle "single-file mboxes"...
> find . -type f | xargs perl -e 'foreach (@ARGV) { open IO, "+<", $_ or
>   next; sysread IO, $buf, 5; if ($buf eq "From ") { sysseek IO, 0, 0;
>   syswrite IO, "Fro:"; }}'
> This breaks the multi-message mbox nicely... >;)
>
>
> Tomi
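
For anyone who would rather not run the Perl one-liner in (**) blind,
here is roughly the same idea as a minimal Python sketch. It overwrites
a leading "From " with "Fro:" in place, so the file keeps its name and
size. The helper names (mangle_from_line, mangle_tree) are made up for
illustration and are not part of notmuch. Like the original, it does
not look for further "From " separators, so it will also mangle the
first line of a multi-message mbox.

```python
import os

MBOX_PREFIX = b"From "   # separator line that makes a file look like an mbox
REPLACEMENT = b"Fro:"    # same 4-byte overwrite as the Perl one-liner


def mangle_from_line(path):
    """Overwrite a leading 'From ' with 'Fro:' in place.

    Returns True if the file was modified, False otherwise.
    Hypothetical helper for illustration; not a notmuch API.
    """
    with open(path, "r+b") as f:
        if f.read(len(MBOX_PREFIX)) != MBOX_PREFIX:
            return False
        f.seek(0)
        # Only the first four bytes change; the following space is kept,
        # so "From x ..." becomes the header-like line "Fro: x ...".
        f.write(REPLACEMENT)
    return True


def mangle_tree(root):
    """Apply mangle_from_line to every file under root; return changed paths."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and mangle_from_line(path):
                changed.append(path)
    return changed
```

Calling mangle_tree on the top of the mail store returns the list of
files it touched, which is worth checking against a backup before
trusting the result.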