unofficial mirror of notmuch@notmuchmail.org
From: david@tethera.net
To: notmuch@notmuchmail.org
Cc: David Bremner <bremner@debian.org>
Subject: [PATCH 2/3] perf-test: cache unpacked corpus
Date: Mon,  3 Dec 2012 21:17:03 -0400
Message-ID: <1354583824-10520-2-git-send-email-david@tethera.net>
In-Reply-To: <1354583824-10520-1-git-send-email-david@tethera.net>

From: David Bremner <bremner@debian.org>

Unpacking is not the expensive step (the initial notmuch new is), but
caching the unpacked corpus is a prerequisite for caching the database.
---
 performance-test/.gitignore       |    1 +
 performance-test/Makefile.local   |    2 +-
 performance-test/perf-test-lib.sh |   51 +++++++++++++++++++++----------------
 3 files changed, 31 insertions(+), 23 deletions(-)

diff --git a/performance-test/.gitignore b/performance-test/.gitignore
index 53f2697..7e20f7c 100644
--- a/performance-test/.gitignore
+++ b/performance-test/.gitignore
@@ -1 +1,2 @@
 tmp.*/
+corpus.mail.*/
diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index 5d2acbd..eb713d0 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -29,4 +29,4 @@ $(TXZFILE):
 download-corpus:
 	wget -O ${TXZFILE} ${DEFAULT_URL}
 
-CLEAN := $(CLEAN) $(dir)/tmp.*
+CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus.mail.*
diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
index bba793d..9fbf874 100644
--- a/performance-test/perf-test-lib.sh
+++ b/performance-test/perf-test-lib.sh
@@ -35,37 +35,44 @@ then
 	exit 1
 fi
 
+CORPUS_DIR=${TEST_DIRECTORY}/corpus.mail.$corpus_size
 add_email_corpus ()
 {
     rm -rf ${MAIL_DIR}
+    if [ ! -d $CORPUS_DIR ]; then
+	case "$corpus_size" in
+	    small)
+		arg="mail/enron/bailey-s"
+		;;
+	    medium)
+		arg="mail/notmuch-archive"
+		;;
+	    *)
+		arg=mail
+	esac
 
-    case "$1" in
-	--small)
-	    arg="mail/enron/bailey-s"
-	    ;;
-	--medium)
-	    arg="mail/notmuch-archive"
-	    ;;
-	*)
-	    arg=mail
-    esac
+	if command -v pixz > /dev/null; then
+	    XZ=pixz
+	else
+	    XZ=xz
+	fi
 
-    if command -v pixz > /dev/null; then
-	XZ=pixz
-    else
-	XZ=xz
-    fi
+	printf "Unpacking corpus\n"
+	mkdir $CORPUS_DIR
+
+	tar --checkpoint=.5000 --extract --strip-components=2 \
+	    --directory $CORPUS_DIR \
+	    --use-compress-program ${XZ} \
+	    --file ../download/notmuch-email-corpus-${PERFTEST_VERSION}.tar.xz \
+	    notmuch-email-corpus/"$arg"
 
-    printf "Unpacking corpus\n"
-    tar --checkpoint=.5000 --extract --strip-components=1 \
-	--directory ${TMP_DIRECTORY} \
-	--use-compress-program ${XZ} \
-	--file ../download/notmuch-email-corpus-${PERFTEST_VERSION}.tar.xz \
-	notmuch-email-corpus/"$arg"
+	printf "\n"
 
-    printf "\n"
+    fi
+    cp -lr $CORPUS_DIR $MAIL_DIR
 }
 
+
 print_header () {
     printf "[v%4s]               Wall(s)\tUsr(s)\tSys(s)\tRes(K)\tIn(512B)\tOut(512B)\n" \
 	   ${PERFTEST_VERSION}
-- 
1.7.10.4
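
The caching scheme in add_email_corpus (unpack once into a per-size
directory, then rebuild the mail directory from it with hard links) can
be sketched in isolation as below. This is a minimal illustration, not
the code from perf-test-lib.sh: the function names populate_cache and
fresh_copy, the demo paths, and the single fake message standing in for
the tar extraction are all hypothetical. It assumes a cp that supports
-l, such as the one in GNU coreutils.

```shell
#!/bin/sh
# Sketch of "unpack once, hard-link many times" caching.
# populate_cache, fresh_copy and demo_root are hypothetical names used
# only for this illustration; they do not appear in perf-test-lib.sh.

populate_cache () {
    # Stand-in for the expensive extraction: it runs only on a cache miss.
    cache_dir=$1
    if [ ! -d "$cache_dir" ]; then
        mkdir -p "$cache_dir/cur"
        printf 'Subject: hello\n\nbody\n' > "$cache_dir/cur/msg1"
        echo "cache miss: unpacked into $cache_dir"
    else
        echo "cache hit: reusing $cache_dir"
    fi
}

fresh_copy () {
    # Recreate the working mail directory from the cache using hard links,
    # so file contents are not duplicated on disk.
    cache_dir=$1
    work_dir=$2
    rm -rf "$work_dir"
    cp -lr "$cache_dir" "$work_dir"   # -l: link instead of copy (assumes GNU cp)
}

demo_root=$(mktemp -d)
populate_cache "$demo_root/corpus.mail.small"   # first call: cache miss
fresh_copy "$demo_root/corpus.mail.small" "$demo_root/mail"
populate_cache "$demo_root/corpus.mail.small"   # second call: cache hit
fresh_copy "$demo_root/corpus.mail.small" "$demo_root/mail"
```

Two properties of hard links are worth keeping in mind with this
approach: the cache and the working copy must live on the same
filesystem, and a file modified in place through one name changes under
the other name as well, since both names refer to the same data.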


Thread overview: 12+ messages
2012-12-04  1:17 [PATCH 1/3] performance-test: add argument parsing for performance tests david
2012-12-04  1:17 ` david [this message]
2012-12-05  4:55   ` [PATCH 2/3] perf-test: cache unpacked corpus Austin Clements
2012-12-04  1:17 ` [PATCH 3/3] perf-test: add caching of xapian database david
2012-12-04  4:18   ` [PATCH 1/4] perf-test: add corpus size to output, compact I/O stats david
2012-12-04  4:18     ` [PATCH 2/4] perf-test: bump corpus version to 0.3 david
2012-12-04  4:34       ` David Bremner
2012-12-04  4:18     ` [PATCH 3/4] perf-test: unpack tags david
2012-12-05  5:23       ` Austin Clements
2012-12-05 12:23         ` David Bremner
2012-12-04  4:18     ` [PATCH 4/4] perf-test: add nmbug tags to default database david
2012-12-05  5:02   ` [PATCH 3/3] perf-test: add caching of xapian database Austin Clements
