From: Steven Allen
To: notmuch@notmuchmail.org
Cc: David Bremner, Olaf TNSB
Subject: Re: Add (extracted) attachment text to the search index?
In-Reply-To: <87inntut68.fsf@tethera.net>
References: <87inntut68.fsf@tethera.net>
Date: Wed, 01 Mar 2017 09:55:31 -0800
Message-ID: <877f48lw4s.fsf@bistromath>

David Bremner writes:

> This would require some modifications of notmuch. Either modifying
> lib/index.cc to add the terms at indexing (notmuch new/insert) time, or
> providing some way of adding the terms later. The former actually sounds
> simpler to me.

To do this correctly, you'd want to be able to run an external text
extraction tool (for PDFs, Word documents, etc.), so I think the latter
would be better in the long run (it would allow the user to index
attachments in the hooks).
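To make the hook idea a bit more concrete, here is a rough sketch of the extraction half only, assuming a hook script that is handed the raw bytes of a message. The names `EXTRACTORS` and `extract_attachment_text` are made up for illustration, and the `pdftotext - -` invocation (read PDF on stdin, write text to stdout) is an assumption about the locally installed tool. Getting the extracted text into the index is exactly the piece notmuch would still need to provide, so it is left out.

```python
import subprocess
from email import message_from_bytes, policy

# Hypothetical mapping from MIME type to an external command that reads the
# attachment on stdin and writes plain text to stdout. The pdftotext
# arguments are an assumption about the installed tool, not a notmuch API.
EXTRACTORS = {
    "application/pdf": ["pdftotext", "-", "-"],
}


def extract_attachment_text(raw_message, extractors=EXTRACTORS):
    """Return (filename, text) pairs for every part we have an extractor for."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    results = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        command = extractors.get(part.get_content_type())
        if command is None:
            continue  # no extractor configured for this MIME type
        payload = part.get_payload(decode=True)  # undo base64/quoted-printable
        if not payload:
            continue
        proc = subprocess.run(
            command,
            input=payload,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        if proc.returncode == 0:
            text = proc.stdout.decode("utf-8", errors="replace")
            results.append((part.get_filename(), text))
    return results
```

A hook could call this for each newly delivered message and pass the resulting text to whatever "add terms later" interface ends up existing. Keeping the extractors in a user-editable table is the main reason to prefer this over doing the extraction inside lib/index.cc.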