From: Austin Clements
To: notmuch@notmuchmail.org
Subject: [PATCH] Store "from" and "subject" headers in the database.
Date: Sun, 6 Nov 2011 12:17:36 -0500
Message-Id: <1320599856-24078-1-git-send-email-amdragon@mit.edu>
X-Mailer: git-send-email 1.7.2.3
Cc: notmuch@kismala.com
List-Id: "Use and development of the notmuch mail system."

This is a rebase and cleanup of Istvan Marko's patch from
id:m3pqnj2j7a.fsf@zsu.kismala.com

Search retrieves these headers for every message in the search
results.  Previously, this required opening and parsing every message
file.  Storing them directly in the database significantly reduces IO
and computation, speeding up search by between 50% and 10X.

Taking full advantage of this requires a database rebuild, but it will
fall back to the old behavior for messages that do not have headers
stored in the database.
---
 lib/database.cc       |    2 +-
 lib/message.cc        |   23 +++++++++++++++++++++--
 lib/notmuch-private.h |   11 +++++++----
 3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index fa632f8..e4ef14e 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1725,7 +1725,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
 		goto DONE;
 
 	    date = notmuch_message_file_get_header (message_file, "date");
-	    _notmuch_message_set_date (message, date);
+	    _notmuch_message_set_header_values (message, date, from, subject);
 
 	    _notmuch_message_index_file (message, filename);
 	} else {
diff --git a/lib/message.cc b/lib/message.cc
index 8f22e02..ca7fbf2 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -412,6 +412,21 @@ _notmuch_message_ensure_message_file (notmuch_message_t *message)
 const char *
 notmuch_message_get_header (notmuch_message_t *message, const char *header)
 {
+    std::string value;
+
+    /* Fetch header from the appropriate xapian value field if
+     * available */
+    if (strcasecmp (header, "from") == 0)
+	value = message->doc.get_value (NOTMUCH_VALUE_FROM);
+    else if (strcasecmp (header, "subject") == 0)
+	value = message->doc.get_value (NOTMUCH_VALUE_SUBJECT);
+    else if (strcasecmp (header, "message-id") == 0)
+	value = message->doc.get_value (NOTMUCH_VALUE_MESSAGE_ID);
+
+    if (!value.empty())
+	return talloc_strdup (message, value.c_str ());
+
+    /* Otherwise fall back to parsing the file */
     _notmuch_message_ensure_message_file (message);
     if (message->message_file == NULL)
 	return NULL;
@@ -795,8 +810,10 @@ notmuch_message_set_author (notmuch_message_t *message,
 }
 
 void
-_notmuch_message_set_date (notmuch_message_t *message,
-			   const char *date)
+_notmuch_message_set_header_values (notmuch_message_t *message,
+				    const char *date,
+				    const char *from,
+				    const char *subject)
 {
     time_t time_value;
 
@@ -809,6 +826,8 @@ _notmuch_message_set_date (notmuch_message_t *message,
 
     message->doc.add_value (NOTMUCH_VALUE_TIMESTAMP,
 			    Xapian::sortable_serialise (time_value));
+    message->doc.add_value (NOTMUCH_VALUE_FROM, from);
+    message->doc.add_value (NOTMUCH_VALUE_SUBJECT, subject);
 }
 
 /* Synchronize changes made to message->doc out into the database. */
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index 0d3cc27..60a932f 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -93,7 +93,9 @@ NOTMUCH_BEGIN_DECLS
 
 typedef enum {
     NOTMUCH_VALUE_TIMESTAMP = 0,
-    NOTMUCH_VALUE_MESSAGE_ID
+    NOTMUCH_VALUE_MESSAGE_ID,
+    NOTMUCH_VALUE_FROM,
+    NOTMUCH_VALUE_SUBJECT
 } notmuch_value_t;
 
 /* Xapian (with flint backend) complains if we provide a term longer
@@ -269,9 +271,10 @@ void
 _notmuch_message_ensure_thread_id (notmuch_message_t *message);
 
 void
-_notmuch_message_set_date (notmuch_message_t *message,
-			   const char *date);
-
+_notmuch_message_set_header_values (notmuch_message_t *message,
+				    const char *date,
+				    const char *from,
+				    const char *subject);
 void
 _notmuch_message_sync (notmuch_message_t *message);
 
-- 
1.7.2.3
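
For readers who want to see the lookup order in isolation, here is a minimal, self-contained sketch of the "stored value first, otherwise parse the file" behavior described in the commit message. It is not notmuch or Xapian code: `Doc`, `ValueSlot`, `get_header`, and the `parse_file` callback are hypothetical stand-ins, and the only assumption carried over from the patch is that an unset value slot reads back as an empty string.

```cpp
#include <cctype>
#include <functional>
#include <map>
#include <string>

// Hypothetical slot numbers, in the same order as the enum in the patch.
enum ValueSlot { VALUE_TIMESTAMP = 0, VALUE_MESSAGE_ID, VALUE_FROM, VALUE_SUBJECT };

// Hypothetical stand-in for a database document: slot number -> stored string.
// An unset slot reads back as "", which is what the fallback test relies on.
struct Doc {
    std::map<int, std::string> values;

    std::string get_value(int slot) const {
        auto it = values.find(slot);
        return it == values.end() ? std::string() : it->second;
    }
};

// Lower-case a header name so the comparison below is case-insensitive.
static std::string lower(std::string s) {
    for (char &c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Return the stored value for "from", "subject" or "message-id" when one
// exists; otherwise call parse_file, which stands in for opening and
// parsing the message file.
std::string get_header(const Doc &doc, const std::string &header,
                       const std::function<std::string(const std::string &)> &parse_file) {
    const std::string name = lower(header);
    std::string value;

    if (name == "from")
        value = doc.get_value(VALUE_FROM);
    else if (name == "subject")
        value = doc.get_value(VALUE_SUBJECT);
    else if (name == "message-id")
        value = doc.get_value(VALUE_MESSAGE_ID);

    if (!value.empty())
        return value;

    // Nothing stored (or the header is not one of the cached ones).
    return parse_file(header);
}
```

One consequence of testing for an empty string, visible in this sketch, is that a header stored as "" cannot be told apart from a header that was never stored, so such messages still take the file-parsing path.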