Date: Mon, 1 Jul 2019 11:36:57 -0400
From: Alvaro Herrera
To: Tomi Ollila
Cc: notmuch@notmuchmail.org
Subject: Re: notmuch ignoring alot of emails
Message-ID: <20190701153657.GA9961@alvherre.pgsql>

On 2019-Jun-30, Tomi Ollila wrote:

> Just checking for a line starting with 'From ' would be pretty naïve,
> since From may be the first word of any line in the text body.

Even so, early mail systems relied on there not being any such lines:
they escaped a body line starting with "From " as ">From ", or encoded
it using quoted-printable.  GMime has bespoke code to do this, in fact.
Mail systems stopped doing this escaping after MIME boundaries became
more widely used, I suppose.

I think NNTP used content length much more extensively than email.  Of
course, NNTP has almost disappeared now ...

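To illustrate the ">From" escaping described above, here is a minimal
Python sketch of the mboxrd flavour of it, where every body line made of
zero or more ">" followed by "From " gets one extra ">" so the change
can be undone.  This is not GMime's code; the function names are
hypothetical and only exist for this example.

```python
import re

# A body line made of zero or more '>' followed by "From " is ambiguous
# in an mbox file.  mboxrd prepends one '>' to each such line, which
# keeps the transformation reversible.
_FROM_LINE = re.compile(r"^(>*From )", re.MULTILINE)
_QUOTED_FROM_LINE = re.compile(r"^>(>*From )", re.MULTILINE)


def escape_body(body):
    """Quote every line that could be mistaken for a separator."""
    return _FROM_LINE.sub(r">\1", body)


def unescape_body(body):
    """Undo escape_body by stripping exactly one leading '>'."""
    return _QUOTED_FROM_LINE.sub(r"\1", body)


def split_messages(mbox_text):
    """Naive splitter: every line starting with 'From ' begins a message."""
    messages, current = [], None
    for line in mbox_text.splitlines(keepends=True):
        if line.startswith("From "):
            if current is not None:
                messages.append("".join(current))
            current = []  # the separator line itself is dropped
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("".join(current))
    return messages
```

With the body escaped, the naive splitter only sees the real separator
lines.  Without escaping, a body line such as "From here on ..." is
taken for the start of a new message, which is the failure discussed in
this thread.
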
> If we'd have to do content scanning then at least an empty line before
> From would be required, and the next lines starting like
> Received: someone@not.an.example
> Date: a date
> From: someone
>
> (and then an empty line... ;)
>
> all this checking would be required and still it could fail (perhaps
> this content gets modified on the fly, but then the signature check,
> if this mail had one, could fail...)

This logic still fails if you have mail-like content in the mail, such
as attachments produced by "git format-patch".  Many open source lists
don't have this problem because they use "git send-email" instead, but
this is not universal.

> If there is a header that tells the length of the body, then things
> could be easier...

Early emails had Content-Length as a header, but it was not universal,
and nowadays it seems to have been abandoned as a practice; the MIME
content boundary is used universally (or at least I cannot find any
recent divergence from this practice).

-- 
Álvaro Herrera
http://www.twitter.com/alvherre