From: David Bremner
To: Olaf TNSB, notmuch@notmuchmail.org
Subject: Re: Add (extracted) attachment text to the search index?
Date: Wed, 01 Mar 2017 07:34:39 -0400
Message-ID: <87inntut68.fsf@tethera.net>

Olaf TNSB writes:

> Hi,
>
> I was wondering if it was possible to add the text extracted from an
> attachment to the search index?
>
> For the moment let's leave aside the important issues like - security,
> buffer overflows, clients having to install
> doc2text/pandoc/pdftotext/whatever...
>
> I *think* I'm trying to ask - How can I take a lump of text (e.g. from an
> attachment) and associate it with a message ID so I can then search for it?
>
> Is this a notmuch command, or a Xapian command?

This would require some modifications to notmuch: either modifying
lib/index.cc to add the terms at indexing time (notmuch new/insert), or
providing some way of adding the terms later. The former actually
sounds simpler to me.

d
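
To illustrate the general idea of "take a lump of text and associate it
with a message ID", here is a minimal, self-contained sketch. It is only
a toy in-memory index, not notmuch or Xapian code: ToyIndex, normalize,
index_extracted_text and search are hypothetical names invented for this
example, and a real change would instead add terms to the message's
document inside lib/index.cc.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Toy stand-in for a search index: term -> set of message IDs.
// Assumption: this is NOT how notmuch or Xapian store terms.
using ToyIndex = std::map<std::string, std::set<std::string>>;

// Lowercase a word and drop every non-alphanumeric character.
std::string normalize(const std::string &word) {
    std::string out;
    for (unsigned char c : word) {
        if (std::isalnum(c))
            out.push_back(static_cast<char>(std::tolower(c)));
    }
    return out;
}

// Hypothetical helper: associate each whitespace-separated term of
// `text` (e.g. text extracted from an attachment) with `message_id`.
void index_extracted_text(ToyIndex &index,
                          const std::string &message_id,
                          const std::string &text) {
    std::istringstream words(text);
    std::string word;
    while (words >> word) {
        const std::string term = normalize(word);
        if (!term.empty())
            index[term].insert(message_id);
    }
}

// Return the message IDs whose indexed text contained `term`.
std::set<std::string> search(const ToyIndex &index, const std::string &term) {
    const auto it = index.find(normalize(term));
    if (it == index.end())
        return {};
    return it->second;
}
```

For example, after calling index_extracted_text(index, "msg-1@example.org",
"Quarterly budget report, draft 2"), a call to search(index, "Budget")
returns a set containing "msg-1@example.org". Doing the equivalent at
indexing time is what makes the first option above attractive: the
extracted text is handled in the same pass as the rest of the message.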