unofficial mirror of notmuch@notmuchmail.org
From: Dmitrijs Ledkovs <xnox@debian.org>
To: Petri Savolainen <petri@koodaamo.fi>
Cc: notmuch@notmuchmail.org
Subject: Re: How to index arbitrary headers?
Date: Thu, 4 Oct 2012 09:17:59 +0100
Message-ID: <CANBHLUjqa67roFmQBMuO_5vpmWPD-j7XR7X5jMhc1UDnC3aEHw@mail.gmail.com>
In-Reply-To: <CACXwgK8Any3Cd+eO21Frz2X8P1kcOkw9XDTN5jj1Nu-DQdHRUg@mail.gmail.com>

On 3 October 2012 19:32, Petri Savolainen <petri@koodaamo.fi> wrote:
> Hi,
>
> thanks for your response. I am evaluating notmuch / xapian for building an
> application for analyzing in various ways a fairly large number of emails
> accumulated over several years. I am afraid the number of headers that would
> ultimately need to be indexed is therefore quite a lot larger than what
> notmuch currently indexes.
>
>  Petri
>
> 2012/10/1 Austin Clements <amdragon@mit.edu>
>>
>> Quoth Petri Savolainen on Oct 01 at  3:39 pm:
>> >    Hello,
>> >    I could not find information anywhere in notmuch docs about what is
>> >    actually indexed - specifically, what email headers are indexed and
>> >    searchable? If a header is not indexed, does searching for its value
>> >    still result in a search hit?
>> >    It would be nice if one could just provide the list of headers to be
>> >    indexed in some configuration file or something.
>> >    Thanks,
>> >     Petri
>>
>> notmuch doesn't currently implement this, though it is an
>> oft-requested feature.  One (not insurmountable) difficulty is that
>> the database would have to be rebuilt if a user-configured list of
>> headers changed and there are technical limitations that prevent us
>> from simply indexing all headers.  Out of curiosity, what headers are
>> you interested in indexing?
>>
>> The currently indexed headers are described in man
>> notmuch-search-terms.
>

Use MapReduce instead: Hadoop, Disco (discoproject), or Hadoop with Dumbo
should be faster.
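
As a rough illustration of the map/reduce idea applied to arbitrary
headers, here is a minimal single-process sketch in Python. It uses only
the standard library, not Hadoop, Disco or Dumbo, and the function names
and the HEADERS tuple are hypothetical, chosen only for this example. The
map step emits per-message counts of (header, value) pairs; the reduce
step merges partial counts, which is the part a real framework would
distribute across workers.

```python
import email
from collections import Counter
from email import policy
from functools import reduce

# Hypothetical selection of headers to analyze; not a notmuch setting.
HEADERS = ("X-Mailer", "List-Id")


def map_message(raw_bytes, headers=HEADERS):
    """Map step: count (header, value) pairs found in one raw message."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    counts = Counter()
    for name in headers:
        # get_all returns every occurrence of the header, or [] if absent.
        for value in msg.get_all(name, []):
            counts[(name.lower(), str(value).strip())] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def count_headers(raw_messages, headers=HEADERS):
    """Run the map step over all messages, then fold the results together."""
    mapped = (map_message(raw, headers) for raw in raw_messages)
    return reduce(reduce_counts, mapped, Counter())


if __name__ == "__main__":
    sample = [
        b"From: a@example.org\nX-Mailer: mutt\n\nhi\n",
        b"From: b@example.org\nX-Mailer: mutt\n\nhello\n",
    ]
    for (header, value), n in count_headers(sample).most_common():
        print(header, value, n)
```

For a real archive, the raw messages could be read from the files of a
Maildir; because the reduce step is associative, the input can be split
into chunks and the partial counts merged afterwards.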

Regards,

Dmitrijs.


Thread overview: 7+ messages
2012-10-01 12:39 How to index arbitrary headers? Petri Savolainen
2012-10-01 15:43 ` Austin Clements
2012-10-03 18:32   ` Petri Savolainen
2012-10-04  8:17     ` Dmitrijs Ledkovs [this message]
2012-10-04 12:51   ` Nicolás Reynolds
2012-10-04 16:25     ` Dmitrijs Ledkovs
2012-10-04 18:18       ` Nicolás Reynolds
