unofficial mirror of notmuch@notmuchmail.org
* [PATCH 1/3] performance-test: add argument parsing for performance tests.
@ 2012-12-04  1:17 david
  2012-12-04  1:17 ` [PATCH 2/3] perf-test: cache unpacked corpus david
  2012-12-04  1:17 ` [PATCH 3/3] perf-test: add caching of xapian database david
  0 siblings, 2 replies; 12+ messages in thread
From: david @ 2012-12-04  1:17 UTC
  To: notmuch; +Cc: David Bremner

From: David Bremner <bremner@debian.org>

This patch just sets (non-exported) variables. The variable $debug is
already used, and $corpus_size will be used in following commits.
---
 performance-test/perf-test-lib.sh |   25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
index 1399d05..bba793d 100644
--- a/performance-test/perf-test-lib.sh
+++ b/performance-test/perf-test-lib.sh
@@ -1,5 +1,30 @@
 . ./version.sh
 
+corpus_size=large
+
+while test "$#" -ne 0
+do
+	case "$1" in
+	-d|--debug)
+		debug=t;
+		shift
+		;;
+	-s|--small)
+		corpus_size=small;
+		shift
+		;;
+	-m|--medium)
+		corpus_size=medium;
+		shift
+		;;
+	-l|--large)
+		corpus_size=large;
+		shift
+		;;
+	*)
+		echo "error: unknown performance test option '$1'" >&2; exit 1 ;;
+	esac
+done
 . ../test/test-lib-common.sh
 
 set -e
-- 
1.7.10.4

* [PATCH 2/3] perf-test: cache unpacked corpus
  2012-12-04  1:17 [PATCH 1/3] performance-test: add argument parsing for performance tests david
@ 2012-12-04  1:17 ` david
  2012-12-05  4:55   ` Austin Clements
  2012-12-04  1:17 ` [PATCH 3/3] perf-test: add caching of xapian database david
  1 sibling, 1 reply; 12+ messages in thread
From: david @ 2012-12-04  1:17 UTC
  To: notmuch; +Cc: David Bremner

From: David Bremner <bremner@debian.org>

Unpacking is not really the expensive step (compared to the initial
notmuch new), but this is a pre-requisite to caching the database.
---
 performance-test/.gitignore       |    1 +
 performance-test/Makefile.local   |    2 +-
 performance-test/perf-test-lib.sh |   51 +++++++++++++++++++++----------------
 3 files changed, 31 insertions(+), 23 deletions(-)

diff --git a/performance-test/.gitignore b/performance-test/.gitignore
index 53f2697..7e20f7c 100644
--- a/performance-test/.gitignore
+++ b/performance-test/.gitignore
@@ -1 +1,2 @@
 tmp.*/
+corpus.mail.*/
diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index 5d2acbd..eb713d0 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -29,4 +29,4 @@ $(TXZFILE):
 download-corpus:
 	wget -O ${TXZFILE} ${DEFAULT_URL}
 
-CLEAN := $(CLEAN) $(dir)/tmp.*
+CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus.mail.*
diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
index bba793d..9fbf874 100644
--- a/performance-test/perf-test-lib.sh
+++ b/performance-test/perf-test-lib.sh
@@ -35,37 +35,44 @@ then
 	exit 1
 fi
 
+CORPUS_DIR=${TEST_DIRECTORY}/corpus.mail.$corpus_size
 add_email_corpus ()
 {
     rm -rf ${MAIL_DIR}
+    if [ ! -d $CORPUS_DIR ]; then
+	case "$corpus_size" in
+	    small)
+		arg="mail/enron/bailey-s"
+		;;
+	    medium)
+		arg="mail/notmuch-archive"
+		;;
+	    *)
+		arg=mail
+	esac
 
-    case "$1" in
-	--small)
-	    arg="mail/enron/bailey-s"
-	    ;;
-	--medium)
-	    arg="mail/notmuch-archive"
-	    ;;
-	*)
-	    arg=mail
-    esac
+	if command -v pixz > /dev/null; then
+	    XZ=pixz
+	else
+	    XZ=xz
+	fi
 
-    if command -v pixz > /dev/null; then
-	XZ=pixz
-    else
-	XZ=xz
-    fi
+	printf "Unpacking corpus\n"
+	mkdir $CORPUS_DIR
+
+	tar --checkpoint=.5000 --extract --strip-components=2 \
+	    --directory $CORPUS_DIR \
+	    --use-compress-program ${XZ} \
+	    --file ../download/notmuch-email-corpus-${PERFTEST_VERSION}.tar.xz \
+	    notmuch-email-corpus/"$arg"
 
-    printf "Unpacking corpus\n"
-    tar --checkpoint=.5000 --extract --strip-components=1 \
-	--directory ${TMP_DIRECTORY} \
-	--use-compress-program ${XZ} \
-	--file ../download/notmuch-email-corpus-${PERFTEST_VERSION}.tar.xz \
-	notmuch-email-corpus/"$arg"
+	printf "\n"
 
-    printf "\n"
+    fi
+    cp -lr $CORPUS_DIR $MAIL_DIR
 }
 
+
 print_header () {
     printf "[v%4s]               Wall(s)\tUsr(s)\tSys(s)\tRes(K)\tIn(512B)\tOut(512B)\n" \
 	   ${PERFTEST_VERSION}
-- 
1.7.10.4

* [PATCH 3/3] perf-test: add caching of xapian database.
  2012-12-04  1:17 [PATCH 1/3] performance-test: add argument parsing for performance tests david
  2012-12-04  1:17 ` [PATCH 2/3] perf-test: cache unpacked corpus david
@ 2012-12-04  1:17 ` david
  2012-12-04  4:18   ` [PATCH 1/4] perf-test: add corpus size to output, compact I/O stats david
  2012-12-05  5:02   ` [PATCH 3/3] perf-test: add caching of xapian database Austin Clements
  1 sibling, 2 replies; 12+ messages in thread
From: david @ 2012-12-04  1:17 UTC
  To: notmuch; +Cc: David Bremner

From: David Bremner <bremner@debian.org>

The caching and uncaching seem to be necessarily manual, as timing the
initial notmuch new is one of our goals with this suite.
---
 performance-test/.gitignore       |    1 +
 performance-test/Makefile.local   |    2 +-
 performance-test/basic            |    5 +++++
 performance-test/perf-test-lib.sh |   18 ++++++++++++++++++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/performance-test/.gitignore b/performance-test/.gitignore
index 7e20f7c..779a115 100644
--- a/performance-test/.gitignore
+++ b/performance-test/.gitignore
@@ -1,2 +1,3 @@
 tmp.*/
 corpus.mail.*/
+notmuch.cache.*/
diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index eb713d0..b136a88 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -29,4 +29,4 @@ $(TXZFILE):
 download-corpus:
 	wget -O ${TXZFILE} ${DEFAULT_URL}
 
-CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus.mail.*
+CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus.mail.* $(dir)/notmuch.cache.*
diff --git a/performance-test/basic b/performance-test/basic
index 9d015ee..41a7ff1 100755
--- a/performance-test/basic
+++ b/performance-test/basic
@@ -2,11 +2,16 @@
 
 . ./perf-test-lib.sh
 
+uncache_database
+
 add_email_corpus
 
 print_header
 
 time_run 'initial notmuch new' 'notmuch new'
+
+cache_database
+
 time_run 'second notmuch new' 'notmuch new'
 time_run 'dump *' 'notmuch dump > tags.out'
 time_run 'restore *' 'notmuch restore < tags.out'
diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
index 9fbf874..c9b131a 100644
--- a/performance-test/perf-test-lib.sh
+++ b/performance-test/perf-test-lib.sh
@@ -36,6 +36,8 @@ then
 fi
 
 CORPUS_DIR=${TEST_DIRECTORY}/corpus.mail.$corpus_size
+DB_CACHE_DIR=${TEST_DIRECTORY}/notmuch.cache.$corpus_size
+
 add_email_corpus ()
 {
     rm -rf ${MAIL_DIR}
@@ -69,9 +71,25 @@ add_email_corpus ()
 	printf "\n"
 
     fi
+
     cp -lr $CORPUS_DIR $MAIL_DIR
+
+    if [ -d $DB_CACHE_DIR ]; then
+	cp -r $DB_CACHE_DIR ${MAIL_DIR}/.notmuch
+    fi
 }
 
+cache_database () {
+    if [ -d $MAIL_DIR/.notmuch ]; then
+	cp -r $MAIL_DIR/.notmuch $DB_CACHE_DIR
+    else
+	echo "Warning: No database found to cache"
+    fi
+}
+
+uncache_database () {
+    rm -rf $DB_CACHE_DIR
+}
 
 print_header () {
     printf "[v%4s]               Wall(s)\tUsr(s)\tSys(s)\tRes(K)\tIn(512B)\tOut(512B)\n" \
-- 
1.7.10.4

* [PATCH 1/4] perf-test: add corpus size to output, compact I/O stats
  2012-12-04  1:17 ` [PATCH 3/3] perf-test: add caching of xapian database david
@ 2012-12-04  4:18   ` david
  2012-12-04  4:18     ` [PATCH 2/4] perf-test: bump corpus version to 0.3 david
                       ` (2 more replies)
  2012-12-05  5:02   ` [PATCH 3/3] perf-test: add caching of xapian database Austin Clements
  1 sibling, 3 replies; 12+ messages in thread
From: david @ 2012-12-04  4:18 UTC
  To: notmuch; +Cc: David Bremner

From: David Bremner <bremner@debian.org>

Austin suggested a while ago that the corpus size be printed in the
header. In the end it seems the corpus will be fixed per test script,
so this suggestion indeed makes sense.

The tabbing was wrapping on my usual 80 column terminal, so I joined
the input and output columns together.
---
 performance-test/perf-test-lib.sh |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
index c9b131a..08e2ebd 100644
--- a/performance-test/perf-test-lib.sh
+++ b/performance-test/perf-test-lib.sh
@@ -92,14 +92,14 @@ uncache_database () {
 }
 
 print_header () {
-    printf "[v%4s]               Wall(s)\tUsr(s)\tSys(s)\tRes(K)\tIn(512B)\tOut(512B)\n" \
-	   ${PERFTEST_VERSION}
+    printf "[v%4s %6s]        Wall(s)\tUsr(s)\tSys(s)\tRes(K)\tIn/Out(512B)\n" \
+	   ${PERFTEST_VERSION} ${corpus_size}
 }
 
 time_run () {
     printf "%-22s" "$1"
     if test "$verbose" != "t"; then exec 4>test.output 3>&4; fi
-    if ! eval >&3 "/usr/bin/time -f '%e\t%U\t%S\t%M\t%I\t%O' $2" ; then
+    if ! eval >&3 "/usr/bin/time -f '%e\t%U\t%S\t%M\t%I/%O' $2" ; then
 	test_failure=$(($test_failure + 1))
     fi
 }
-- 
1.7.10.4

* [PATCH 2/4] perf-test: bump corpus version to 0.3
  2012-12-04  4:18   ` [PATCH 1/4] perf-test: add corpus size to output, compact I/O stats david
@ 2012-12-04  4:18     ` david
  2012-12-04  4:34       ` David Bremner
  2012-12-04  4:18     ` [PATCH 3/4] perf-test: unpack tags david
  2012-12-04  4:18     ` [PATCH 4/4] perf-test: add nmbug tags to default database david
  2 siblings, 1 reply; 12+ messages in thread
From: david @ 2012-12-04  4:18 UTC
  To: notmuch; +Cc: David Bremner

From: David Bremner <bremner@debian.org>

The new version ships with some tags, and an updated archive of the
notmuch mailing list.
---
 performance-test/download/notmuch-email-corpus-0.2.tar.xz.asc |    9 ---------
 performance-test/download/notmuch-email-corpus-0.3.tar.xz.asc |    9 +++++++++
 performance-test/version.sh                                   |    2 +-
 3 files changed, 10 insertions(+), 10 deletions(-)
 delete mode 100644 performance-test/download/notmuch-email-corpus-0.2.tar.xz.asc
 create mode 100644 performance-test/download/notmuch-email-corpus-0.3.tar.xz.asc

diff --git a/performance-test/download/notmuch-email-corpus-0.2.tar.xz.asc b/performance-test/download/notmuch-email-corpus-0.2.tar.xz.asc
deleted file mode 100644
index c8b4b3d..0000000
--- a/performance-test/download/notmuch-email-corpus-0.2.tar.xz.asc
+++ /dev/null
@@ -1,9 +0,0 @@
------BEGIN PGP SIGNATURE-----
-Version: GnuPG v1.4.12 (GNU/Linux)
-
-iJwEAAECAAYFAlCsvx0ACgkQTiiN/0Um85kZAwP9GgOQ22jK8mr5X4pT/mB8EjSH
-QbndlxxbRrP0ChTqjBQoD3IsTHjNL7W572BfXb/MNo94R/iIQ7yTHCDVNuwBhvKd
-7qgIuW2FUS1uTfJRP5KBNf8JPuin+6wqGe8/+y/iOs+XJSdiYg1ElS49Ntnpg0yl
-btImgEcxTxQ2qfzDS1g=
-=iuZR
------END PGP SIGNATURE-----
diff --git a/performance-test/download/notmuch-email-corpus-0.3.tar.xz.asc b/performance-test/download/notmuch-email-corpus-0.3.tar.xz.asc
new file mode 100644
index 0000000..f109e81
--- /dev/null
+++ b/performance-test/download/notmuch-email-corpus-0.3.tar.xz.asc
@@ -0,0 +1,9 @@
+-----BEGIN PGP SIGNATURE-----
+Version: GnuPG v1.4.12 (GNU/Linux)
+
+iJwEAAECAAYFAlC9a90ACgkQTiiN/0Um85nAMAP+LCWdKzolcl/KW+JcCd0Dk+9v
+0vvtBVEhBes0TbK6iWrxCV2OIuYG/RhnFlJTZ4MjgaTRxzDubpC+JktaJdLmIQUN
+B7ZIDMjFduCwmtyLiuu/00CjxJKUXm7vx+ULGpvp0uxFE/vaqGP997BHwBjjfBVm
+YX6BlLX1SV6TfENkuRE=
+=Mks5
+-----END PGP SIGNATURE-----
diff --git a/performance-test/version.sh b/performance-test/version.sh
index d9270b1..afafc73 100644
--- a/performance-test/version.sh
+++ b/performance-test/version.sh
@@ -1,3 +1,3 @@
 # this should be both a valid Makefile fragment and valid POSIX(ish) shell.
 
-PERFTEST_VERSION=0.2
+PERFTEST_VERSION=0.3
-- 
1.7.10.4

* [PATCH 3/4] perf-test: unpack tags.
  2012-12-04  4:18   ` [PATCH 1/4] perf-test: add corpus size to output, compact I/O stats david
  2012-12-04  4:18     ` [PATCH 2/4] perf-test: bump corpus version to 0.3 david
@ 2012-12-04  4:18     ` david
  2012-12-05  5:23       ` Austin Clements
  2012-12-04  4:18     ` [PATCH 4/4] perf-test: add nmbug tags to default database david
  2 siblings, 1 reply; 12+ messages in thread
From: david @ 2012-12-04  4:18 UTC
  To: notmuch; +Cc: David Bremner

From: David Bremner <bremner@debian.org>

It's a bit annoying to call tar twice, but we cache the results so it
isn't as bad as it could be.
---
 performance-test/Makefile.local   |    1 +
 performance-test/perf-test-lib.sh |   25 +++++++++++++++++++------
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
index b136a88..cdd7f19 100644
--- a/performance-test/Makefile.local
+++ b/performance-test/Makefile.local
@@ -30,3 +30,4 @@ download-corpus:
 	wget -O ${TXZFILE} ${DEFAULT_URL}
 
 CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus.mail.* $(dir)/notmuch.cache.*
+CLEAN := $(CLEAN) $(dir)/corpus.tags
diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
index 08e2ebd..40c88c9 100644
--- a/performance-test/perf-test-lib.sh
+++ b/performance-test/perf-test-lib.sh
@@ -41,6 +41,13 @@ DB_CACHE_DIR=${TEST_DIRECTORY}/notmuch.cache.$corpus_size
 add_email_corpus ()
 {
     rm -rf ${MAIL_DIR}
+
+    if command -v pixz > /dev/null; then
+	XZ=pixz
+    else
+	XZ=xz
+    fi
+
     if [ ! -d $CORPUS_DIR ]; then
 	case "$corpus_size" in
 	    small)
@@ -53,12 +60,6 @@ add_email_corpus ()
 		arg=mail
 	esac
 
-	if command -v pixz > /dev/null; then
-	    XZ=pixz
-	else
-	    XZ=xz
-	fi
-
 	printf "Unpacking corpus\n"
 	mkdir $CORPUS_DIR
 
@@ -72,6 +73,18 @@ add_email_corpus ()
 
     fi
 
+    if [ ! -d $TEST_DIRECTORY/corpus.tags ]; then
+
+	mkdir $TEST_DIRECTORY/corpus.tags
+
+	tar --extract --strip-components=2 \
+	    --directory $TEST_DIRECTORY/corpus.tags \
+	    --use-compress-program ${XZ} \
+	    --file ../download/notmuch-email-corpus-${PERFTEST_VERSION}.tar.xz \
+	    notmuch-email-corpus/tags
+    fi
+
+    cp -lr $TEST_DIRECTORY/corpus.tags $TMP_DIRECTORY
     cp -lr $CORPUS_DIR $MAIL_DIR
 
     if [ -d $DB_CACHE_DIR ]; then
-- 
1.7.10.4

* [PATCH 4/4] perf-test: add nmbug tags to default database
  2012-12-04  4:18   ` [PATCH 1/4] perf-test: add corpus size to output, compact I/O stats david
  2012-12-04  4:18     ` [PATCH 2/4] perf-test: bump corpus version to 0.3 david
  2012-12-04  4:18     ` [PATCH 3/4] perf-test: unpack tags david
@ 2012-12-04  4:18     ` david
  2 siblings, 0 replies; 12+ messages in thread
From: david @ 2012-12-04  4:18 UTC
  To: notmuch; +Cc: David Bremner

From: David Bremner <bremner@debian.org>

This makes the tag set a bit less boring, and also acts as a benchmark
on its own.
---
 performance-test/basic |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/performance-test/basic b/performance-test/basic
index 41a7ff1..3225983 100755
--- a/performance-test/basic
+++ b/performance-test/basic
@@ -10,6 +10,8 @@ print_header
 
 time_run 'initial notmuch new' 'notmuch new'
 
+time_run 'load nmbug tags' 'notmuch restore --accumulate < corpus.tags/nmbug.sup-dump'
+
 cache_database
 
 time_run 'second notmuch new' 'notmuch new'
-- 
1.7.10.4

* Re: [PATCH 2/4] perf-test: bump corpus version to 0.3
  2012-12-04  4:18     ` [PATCH 2/4] perf-test: bump corpus version to 0.3 david
@ 2012-12-04  4:34       ` David Bremner
  0 siblings, 0 replies; 12+ messages in thread
From: David Bremner @ 2012-12-04  4:34 UTC
  To: notmuch

david@tethera.net writes:

> From: David Bremner <bremner@debian.org>
>
> The new version ships with some tags, and an updated archive of the
> notmuch mailing list.

Because of scarce disk space on notmuchmail.org, you currently have to
grab the corpus from:

http://tesseract.cs.unb.ca/notmuch/notmuch-email-corpus-0.3.tar.xz
http://tesseract.cs.unb.ca/notmuch/notmuch-email-corpus-0.3.tar.xz.asc

d

P.S. In partial answer to id:87txs4cy7v.fsf@nikula.org, the new optimization
seems to be just as fast, even with the additional tags.

* Re: [PATCH 2/3] perf-test: cache unpacked corpus
  2012-12-04  1:17 ` [PATCH 2/3] perf-test: cache unpacked corpus david
@ 2012-12-05  4:55   ` Austin Clements
  0 siblings, 0 replies; 12+ messages in thread
From: Austin Clements @ 2012-12-05  4:55 UTC
  To: david, notmuch; +Cc: David Bremner

On Mon, 03 Dec 2012, david@tethera.net wrote:
> From: David Bremner <bremner@debian.org>
>
> Unpacking is not really the expensive step (compared to the initial
> notmuch new), but this is a pre-requisite to caching the database.
> ---
>  performance-test/.gitignore       |    1 +
>  performance-test/Makefile.local   |    2 +-
>  performance-test/perf-test-lib.sh |   51 +++++++++++++++++++++----------------
>  3 files changed, 31 insertions(+), 23 deletions(-)
>
> diff --git a/performance-test/.gitignore b/performance-test/.gitignore
> index 53f2697..7e20f7c 100644
> --- a/performance-test/.gitignore
> +++ b/performance-test/.gitignore
> @@ -1 +1,2 @@
>  tmp.*/
> +corpus.mail.*/
> diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
> index 5d2acbd..eb713d0 100644
> --- a/performance-test/Makefile.local
> +++ b/performance-test/Makefile.local
> @@ -29,4 +29,4 @@ $(TXZFILE):
>  download-corpus:
>  	wget -O ${TXZFILE} ${DEFAULT_URL}
>  
> -CLEAN := $(CLEAN) $(dir)/tmp.*
> +CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus.mail.*
> diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
> index bba793d..9fbf874 100644
> --- a/performance-test/perf-test-lib.sh
> +++ b/performance-test/perf-test-lib.sh
> @@ -35,37 +35,44 @@ then
>  	exit 1
>  fi
>  
> +CORPUS_DIR=${TEST_DIRECTORY}/corpus.mail.$corpus_size
>  add_email_corpus ()
>  {
>      rm -rf ${MAIL_DIR}
> +    if [ ! -d $CORPUS_DIR ]; then
> +	case "$corpus_size" in
> +	    small)
> +		arg="mail/enron/bailey-s"
> +		;;
> +	    medium)
> +		arg="mail/notmuch-archive"
> +		;;
> +	    *)
> +		arg=mail
> +	esac
>  
> -    case "$1" in
> -	--small)
> -	    arg="mail/enron/bailey-s"
> -	    ;;
> -	--medium)
> -	    arg="mail/notmuch-archive"
> -	    ;;

The README still refers to these arguments, so it should be updated,
too.

> -	*)
> -	    arg=mail
> -    esac
> +	if command -v pixz > /dev/null; then
> +	    XZ=pixz
> +	else
> +	    XZ=xz
> +	fi
>  
> -    if command -v pixz > /dev/null; then
> -	XZ=pixz
> -    else
> -	XZ=xz
> -    fi
> +	printf "Unpacking corpus\n"
> +	mkdir $CORPUS_DIR
> +
> +	tar --checkpoint=.5000 --extract --strip-components=2 \
> +	    --directory $CORPUS_DIR \
> +	    --use-compress-program ${XZ} \
> +	    --file ../download/notmuch-email-corpus-${PERFTEST_VERSION}.tar.xz \
> +	    notmuch-email-corpus/"$arg"
>  
> -    printf "Unpacking corpus\n"
> -    tar --checkpoint=.5000 --extract --strip-components=1 \
> -	--directory ${TMP_DIRECTORY} \
> -	--use-compress-program ${XZ} \
> -	--file ../download/notmuch-email-corpus-${PERFTEST_VERSION}.tar.xz \
> -	notmuch-email-corpus/"$arg"
> +	printf "\n"
>  
> -    printf "\n"
> +    fi
> +    cp -lr $CORPUS_DIR $MAIL_DIR
>  }
>  
> +
>  print_header () {
>      printf "[v%4s]               Wall(s)\tUsr(s)\tSys(s)\tRes(K)\tIn(512B)\tOut(512B)\n" \
>  	   ${PERFTEST_VERSION}
> -- 
> 1.7.10.4
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch

* Re: [PATCH 3/3] perf-test: add caching of xapian database.
  2012-12-04  1:17 ` [PATCH 3/3] perf-test: add caching of xapian database david
  2012-12-04  4:18   ` [PATCH 1/4] perf-test: add corpus size to output, compact I/O stats david
@ 2012-12-05  5:02   ` Austin Clements
  1 sibling, 0 replies; 12+ messages in thread
From: Austin Clements @ 2012-12-05  5:02 UTC
  To: david, notmuch; +Cc: David Bremner

This seems like an odd way to do this.  Timing the initial notmuch new
seems like the goal of exactly one performance test and irrelevant to
the others.  What about splitting "basic" into two tests: an explicit
notmuch new performance test and a separate test that does
dump/restore/tag?  The second test can always use a cached notmuch
database (creating it if necessary).  If you want to be really kind, the
notmuch new performance test could save its result as the database cache
if there isn't a database cache; then people don't have to build the
database twice if they (or notmuch-perf-test) run the tests in the right
order.
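
The "cached database, creating it if necessary" idea above could be
sketched roughly as follows.  This is only an illustration, not code
from these patches: ensure_cached_database is a hypothetical helper, and
it assumes the caller passes the mail directory, the cache directory,
and the command that builds the database (for example notmuch new).

```shell
# Hypothetical helper, not part of the patches in this thread.
# usage: ensure_cached_database MAIL_DIR DB_CACHE_DIR BUILD_COMMAND...
ensure_cached_database () {
    mail_dir=$1
    cache_dir=$2
    shift 2

    if [ ! -d "$cache_dir" ]; then
        # No cache yet: run the build command once, then save a copy.
        "$@" || return 1
        cp -r "$mail_dir/.notmuch" "$cache_dir"
    elif [ ! -d "$mail_dir/.notmuch" ]; then
        # Cache exists: restore it instead of rebuilding the database.
        cp -r "$cache_dir" "$mail_dir/.notmuch"
    fi
}
```

A dump/restore/tag test might then call
ensure_cached_database "$MAIL_DIR" "$DB_CACHE_DIR" notmuch new
after add_email_corpus, while the dedicated notmuch new test would keep
timing the initial run itself.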

On Mon, 03 Dec 2012, david@tethera.net wrote:
> From: David Bremner <bremner@debian.org>
>
> The caching and uncaching seem to be necessarily manual, as timing the
> initial notmuch new is one of our goals with this suite.
> ---
>  performance-test/.gitignore       |    1 +
>  performance-test/Makefile.local   |    2 +-
>  performance-test/basic            |    5 +++++
>  performance-test/perf-test-lib.sh |   18 ++++++++++++++++++
>  4 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/performance-test/.gitignore b/performance-test/.gitignore
> index 7e20f7c..779a115 100644
> --- a/performance-test/.gitignore
> +++ b/performance-test/.gitignore
> @@ -1,2 +1,3 @@
>  tmp.*/
>  corpus.mail.*/
> +notmuch.cache.*/
> diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
> index eb713d0..b136a88 100644
> --- a/performance-test/Makefile.local
> +++ b/performance-test/Makefile.local
> @@ -29,4 +29,4 @@ $(TXZFILE):
>  download-corpus:
>  	wget -O ${TXZFILE} ${DEFAULT_URL}
>  
> -CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus.mail.*
> +CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus.mail.* $(dir)/notmuch.cache.*
> diff --git a/performance-test/basic b/performance-test/basic
> index 9d015ee..41a7ff1 100755
> --- a/performance-test/basic
> +++ b/performance-test/basic
> @@ -2,11 +2,16 @@
>  
>  . ./perf-test-lib.sh
>  
> +uncache_database
> +
>  add_email_corpus
>  
>  print_header
>  
>  time_run 'initial notmuch new' 'notmuch new'
> +
> +cache_database
> +
>  time_run 'second notmuch new' 'notmuch new'
>  time_run 'dump *' 'notmuch dump > tags.out'
>  time_run 'restore *' 'notmuch restore < tags.out'
> diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
> index 9fbf874..c9b131a 100644
> --- a/performance-test/perf-test-lib.sh
> +++ b/performance-test/perf-test-lib.sh
> @@ -36,6 +36,8 @@ then
>  fi
>  
>  CORPUS_DIR=${TEST_DIRECTORY}/corpus.mail.$corpus_size
> +DB_CACHE_DIR=${TEST_DIRECTORY}/notmuch.cache.$corpus_size
> +
>  add_email_corpus ()
>  {
>      rm -rf ${MAIL_DIR}
> @@ -69,9 +71,25 @@ add_email_corpus ()
>  	printf "\n"
>  
>      fi
> +
>      cp -lr $CORPUS_DIR $MAIL_DIR
> +
> +    if [ -d $DB_CACHE_DIR ]; then
> +	cp -r $DB_CACHE_DIR ${MAIL_DIR}/.notmuch
> +    fi
>  }
>  
> +cache_database () {
> +    if [ -d $MAIL_DIR/.notmuch ]; then
> +	cp -r $MAIL_DIR/.notmuch $DB_CACHE_DIR
> +    else
> +	echo "Warning: No database found to cache"
> +    fi
> +}
> +
> +uncache_database () {
> +    rm -rf $DB_CACHE_DIR
> +}
>  
>  print_header () {
>      printf "[v%4s]               Wall(s)\tUsr(s)\tSys(s)\tRes(K)\tIn(512B)\tOut(512B)\n" \
> -- 
> 1.7.10.4
>

* Re: [PATCH 3/4] perf-test: unpack tags.
  2012-12-04  4:18     ` [PATCH 3/4] perf-test: unpack tags david
@ 2012-12-05  5:23       ` Austin Clements
  2012-12-05 12:23         ` David Bremner
  0 siblings, 1 reply; 12+ messages in thread
From: Austin Clements @ 2012-12-05  5:23 UTC
  To: david, notmuch; +Cc: David Bremner

On Mon, 03 Dec 2012, david@tethera.net wrote:
> From: David Bremner <bremner@debian.org>
>
> It's a bit annoying to call tar twice, but we cache the results so it
> isn't as bad as it could be.
> ---
>  performance-test/Makefile.local   |    1 +
>  performance-test/perf-test-lib.sh |   25 +++++++++++++++++++------
>  2 files changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/performance-test/Makefile.local b/performance-test/Makefile.local
> index b136a88..cdd7f19 100644
> --- a/performance-test/Makefile.local
> +++ b/performance-test/Makefile.local
> @@ -30,3 +30,4 @@ download-corpus:
>  	wget -O ${TXZFILE} ${DEFAULT_URL}
>  
>  CLEAN := $(CLEAN) $(dir)/tmp.* $(dir)/corpus.mail.* $(dir)/notmuch.cache.*
> +CLEAN := $(CLEAN) $(dir)/corpus.tags
> diff --git a/performance-test/perf-test-lib.sh b/performance-test/perf-test-lib.sh
> index 08e2ebd..40c88c9 100644
> --- a/performance-test/perf-test-lib.sh
> +++ b/performance-test/perf-test-lib.sh
> @@ -41,6 +41,13 @@ DB_CACHE_DIR=${TEST_DIRECTORY}/notmuch.cache.$corpus_size
>  add_email_corpus ()
>  {
>      rm -rf ${MAIL_DIR}
> +
> +    if command -v pixz > /dev/null; then
> +	XZ=pixz
> +    else
> +	XZ=xz
> +    fi
> +
>      if [ ! -d $CORPUS_DIR ]; then
>  	case "$corpus_size" in
>  	    small)
> @@ -53,12 +60,6 @@ add_email_corpus ()
>  		arg=mail
>  	esac
>  
> -	if command -v pixz > /dev/null; then
> -	    XZ=pixz
> -	else
> -	    XZ=xz
> -	fi
> -
>  	printf "Unpacking corpus\n"
>  	mkdir $CORPUS_DIR
>  
> @@ -72,6 +73,18 @@ add_email_corpus ()
>  
>      fi
>  
> +    if [ ! -d $TEST_DIRECTORY/corpus.tags ]; then
> +
> +	mkdir $TEST_DIRECTORY/corpus.tags
> +
> +	tar --extract --strip-components=2 \
> +	    --directory $TEST_DIRECTORY/corpus.tags \
> +	    --use-compress-program ${XZ} \
> +	    --file ../download/notmuch-email-corpus-${PERFTEST_VERSION}.tar.xz \
> +	    notmuch-email-corpus/tags

Why not --strip-components=1 and unpack both mail/ and tags/ into a
single, shared corpus cache directory in one call to tar?  Since you're
going to cp -lr things anyway, you can structure the corpus cache
however is convenient.

> +    fi
> +
> +    cp -lr $TEST_DIRECTORY/corpus.tags $TMP_DIRECTORY
>      cp -lr $CORPUS_DIR $MAIL_DIR
>  
>      if [ -d $DB_CACHE_DIR ]; then
> -- 
> 1.7.10.4
>

* Re: [PATCH 3/4] perf-test: unpack tags.
  2012-12-05  5:23       ` Austin Clements
@ 2012-12-05 12:23         ` David Bremner
  0 siblings, 0 replies; 12+ messages in thread
From: David Bremner @ 2012-12-05 12:23 UTC
  To: Austin Clements, notmuch

Austin Clements <aclements@csail.mit.edu> writes:

> On Mon, 03 Dec 2012, david@tethera.net wrote:
>> From: David Bremner <bremner@debian.org>
>>
>> It's a bit annoying to call tar twice, but we cache the results so it
>> isn't as bad as it could be.
> Why not --strip-components=1 and unpack both mail/ and tags/ into a
> single, shared corpus cache directory in one call to tar?  Since you're
> going to cp -lr things anyway, you can structure the corpus cache
> however is convenient.

It's a good suggestion. The only downside is duplicating the tags. I
suppose on the scale of things that isn't a very big waste of space; the
tag corpus is currently 300k, which is dwarfed by even the "small"
corpus.
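
For reference, the single-call variant discussed here might look
something like the sketch below.  It is an illustration under stated
assumptions, not the code that was eventually committed: unpack_corpus
is a hypothetical helper, and the pixz/xz selection from the patch is
left out on the assumption that GNU tar detects the compression of the
archive on its own.

```shell
# Hypothetical helper, not part of the patches in this thread.
# usage: unpack_corpus ARCHIVE CACHE_DIR MEMBER...
unpack_corpus () {
    archive=$1
    cache_dir=$2
    shift 2

    # Already unpacked: nothing to do.
    [ -d "$cache_dir" ] && return 0

    mkdir "$cache_dir" || return 1
    # Stripping only the top-level directory keeps mail/ and tags/
    # side by side inside the cache directory.
    tar --extract --strip-components=1 \
        --directory "$cache_dir" \
        --file "$archive" \
        "$@"
}
```

For the small corpus this could be called with the members
notmuch-email-corpus/mail/enron/bailey-s and notmuch-email-corpus/tags,
after which cp -lr can link the mail/ and tags/ subdirectories of the
cache into the test's temporary directory.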

d
