From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
    shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
    by dcvr.yhbt.net (Postfix) with ESMTP id 37CE61F45A
    for ; Mon, 20 Apr 2020 09:20:59 +0000 (UTC)
Date: Mon, 20 Apr 2020 09:20:59 +0000
From: Eric Wong
To: meta@public-inbox.org
Subject: message/rfc822 and other message/* attachments
Message-ID: <20200420092059.GA465@dcvr>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
List-Id:

message/rfc822 is an attachment where another email message is
attached as-is, and that attached RFC822 message may have
attachments of its own (just like normal MIME containers).

Email::MIME doesn't descend into them (but gmime and notmuch do).

I started working on them manually in msg_iter(), but some
things need to be considered:

* current URLs to existing message/* attachments MUST NOT break

* Message-IDs, From, To, Cc, Subject, not-quoted, quoted,
  diff-specific stuff in attached messages should be indexed
  and searchable from Xapian, at least

* Date: searchability... probably not?  We already fail
  at odd messages having multiple Date headers...

* References and threading could be tricky.  Can probably be
  ignored in attachments, though References/In-Reply-To could
  probably be indexed via Xapian...

* content-based checksumming + deduplication could be subverted
  due to version mismatches, but it may not matter in practice.

...
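
To make the "descend into message/rfc822" idea concrete, here is a rough
sketch in Python (not public-inbox's Perl), using only the standard-library
email package, which happens to parse message/rfc822 parts into nested
message objects.  The names iter_parts, attached_headers and
INDEXED_HEADERS, and the tuple-based part numbering, are made up for
illustration; they are not msg_iter() and say nothing about the existing
attachment URL scheme.

```python
from email import message_from_string

# Headers from attached messages that the list above suggests indexing.
# (Illustrative selection; the exact set is an assumption.)
INDEXED_HEADERS = ("Message-ID", "From", "To", "Cc", "Subject")

RAW = """\
From: a@example.com
Subject: outer
Message-ID: <outer@example.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="b"

--b
Content-Type: text/plain

see attached
--b
Content-Type: message/rfc822

From: b@example.com
Subject: inner
Message-ID: <inner@example.com>

inner body
--b--
"""


def iter_parts(msg, path=()):
    """Yield (path, leaf_part), descending into multipart/* and message/rfc822.

    For a message/rfc822 part, get_payload() is a one-element list holding
    the attached message, so is_multipart() is true for it as well.
    """
    if msg.is_multipart():
        for i, sub in enumerate(msg.get_payload(), 1):
            yield from iter_parts(sub, path + (i,))
    else:
        yield path, msg


def attached_headers(msg):
    """Collect the indexable headers of every attached message/rfc822."""
    found = []
    for part in msg.walk():  # walk() also descends into attached messages
        if part.get_content_type() == "message/rfc822":
            inner = part.get_payload(0)
            found.append(
                {h: inner[h] for h in INDEXED_HEADERS if inner[h] is not None}
            )
    return found


msg = message_from_string(RAW)
for path, part in iter_parts(msg):
    print(path, part.get_content_type())
print(attached_headers(msg))
```

In this sketch the top-level parts keep their own position in the path
and the attached message only appends a further component, which is one
way a numbering scheme could avoid renumbering existing attachments.
Whether that fits the current URLs is an open question here.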