* [PATCH] cuirass: Perform some database "optimization" at startup.
From: Christopher Baines @ 2020-05-25 10:15 UTC
To: guix-devel
Add an "optimize" step that occurs when starting up the main Cuirass
process. Currently this does two things, but could be extended to do more.
The "PRAGMA optimize;" command prompts SQLite to ANALYZE tables where that
might help. The "PRAGMA wal_checkpoint(TRUNCATE);" command has SQLite process
any unprocessed changes from the WAL file, then truncate it to 0 bytes. I've
got no data to suggest this helps with performance, but I'm hoping that going
from a large WAL file to a small one occasionally might be useful.
* src/cuirass/database.scm (db-optimize): New procedure.
* bin/cuirass.in (main): Run it.
---
bin/cuirass.in | 4 ++++
src/cuirass/database.scm | 8 ++++++++
2 files changed, 12 insertions(+)
diff --git a/bin/cuirass.in b/bin/cuirass.in
index fbc7c3c..7a2d5ae 100644
--- a/bin/cuirass.in
+++ b/bin/cuirass.in
@@ -124,6 +124,10 @@ exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
(min (current-processor-count) 4))))
(prepare-git)
+ (unless (option-ref opts 'web #f)
+ (log-message "performing database optimizations")
+ (db-optimize))
+
(log-message "running Fibers on ~a kernel threads" threads)
(run-fibers
(lambda ()
diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
index f80585e..e81ead0 100644
--- a/src/cuirass/database.scm
+++ b/src/cuirass/database.scm
@@ -38,6 +38,7 @@
db-init
db-open
db-close
+ db-optimize
db-add-specification
db-remove-specification
db-get-specifications
@@ -277,6 +278,13 @@ database object."
"Close database object DB."
(sqlite-close db))
+(define* (db-optimize #:optional (db-file (%package-database)))
+  "Open the database and perform optimizations."
+  (let ((db (db-open db-file)))
+    (sqlite-exec db "PRAGMA optimize;")
+    (sqlite-exec db "PRAGMA wal_checkpoint(TRUNCATE);")
+    (db-close db)))
+
(define (last-insert-rowid db)
(vector-ref (car (sqlite-exec db "SELECT last_insert_rowid();"))
0))
--
2.26.2
* Re: [PATCH] cuirass: Perform some database "optimization" at startup.
From: Danny Milosavljevic @ 2020-05-25 10:48 UTC
To: Christopher Baines; +Cc: guix-devel
Hi Chris,
the docs at https://www.sqlite.org/pragma.html#pragma_optimize suggest running
"PRAGMA optimize" at the end of the connection, or periodically--not at the
beginning.
That makes sense since it has to be able to see which queries are emitted
in order to know what to optimize.
Also, docs say:
> The query planner used sqlite_stat1-style statistics for one or more indexes
> of the table at some point during the lifetime of the current connection.
That probably means one would have had to run ANALYZE at some point in the past.
Replaying the WAL sounds like a good idea at the beginning, though. Most
journalling filesystems do that too.
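
To make the WAL part concrete, here is a minimal sketch of what
"PRAGMA wal_checkpoint(TRUNCATE);" does to the -wal file. It uses Python's
standard sqlite3 module, not Guile, and it is not Cuirass code. The helper
names (wal_size, checkpoint_and_truncate, demo) and the "builds" table are
made up for illustration, and the sketch assumes no other connection holds a
read transaction while the checkpoint runs.

```python
import os
import sqlite3
import tempfile


def wal_size(db_path):
    """Size in bytes of the WAL file next to db_path (0 if it does not exist)."""
    wal_path = db_path + "-wal"
    return os.path.getsize(wal_path) if os.path.exists(wal_path) else 0


def checkpoint_and_truncate(db_path):
    """Copy pending WAL frames into the main database file, then truncate the WAL.

    Returns the (busy, log_frames, checkpointed_frames) row from the pragma;
    busy is 0 when the checkpoint ran to completion.
    """
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA wal_checkpoint(TRUNCATE);").fetchone()
    finally:
        conn.close()


def demo():
    """Write some rows in WAL mode, then compare the WAL size before and after."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        # Keep a writer connection open: closing the last connection would
        # make SQLite checkpoint and remove the WAL on its own.
        writer = sqlite3.connect(db_path)
        writer.execute("PRAGMA journal_mode=WAL;").fetchall()
        writer.execute(
            "CREATE TABLE builds (id INTEGER PRIMARY KEY, status INTEGER)")
        writer.executemany("INSERT INTO builds (status) VALUES (?)",
                           [(i % 3,) for i in range(1000)])
        writer.commit()

        before = wal_size(db_path)
        busy, _, _ = checkpoint_and_truncate(db_path)
        after = wal_size(db_path)
        writer.close()
        return before, busy, after


if __name__ == "__main__":
    print(demo())
```

Closing the last connection normally checkpoints and removes the WAL by
itself, so the explicit pragma mostly matters while other connections stay
open or after an unclean shutdown.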
* Re: [PATCH] cuirass: Perform some database "optimization" at startup.
From: Christopher Baines @ 2020-05-25 12:42 UTC
To: Danny Milosavljevic; +Cc: guix-devel
Danny Milosavljevic <dannym@scratchpost.org> writes:
> the docs at https://www.sqlite.org/pragma.html#pragma_optimize suggest running
> "PRAGMA optimize" at the end of the connection, or periodically--not at the
> beginning.
>
> That makes sense since it has to be able to see which queries are emitted
> in order to know what to optimize.
>
> Also, docs say:
>
>> The query planner used sqlite_stat1-style statistics for one or more indexes
>> of the table at some point during the lifetime of the current connection.
>
> That probably means one would have had to run ANALYZE at some point in the past.
Thanks Danny, this is interesting.
From my reading of the docs, I think the only thing the optimize pragma
is going to do is run ANALYZE on some tables. There's something about
the current connection referenced in "Determination Of When To Run
Analyze", but it's not the only thing that triggers this.
There is some stuff mentioned about recent queries, but it seems to be
prefixed with "(Not yet implemented)".
I'm not sure where this would fit in the Cuirass code when connections
are closed, as I'm not sure where the connections are closed! In terms
of running this regularly, I'm up for trying to work that in, maybe it
could happen after new data has been added, or something like that.
Although I don't have any evidence to support this, I'm hoping running
the optimize pragma at startup will help in some cases, like when
migrations add new indexes, as I think the docs say SQLite will analyze
a table if an index hasn't been analyzed yet.
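
As a rough illustration of the new-index case, the following sketch shows
that an index has no sqlite_stat1 entry until ANALYZE has run, and it runs
"PRAGMA optimize;" just before closing, which is where the SQLite docs
suggest putting it. It uses Python's standard sqlite3 module rather than
Guile, and it is not Cuirass code; the table, index and helper names are
invented. Whether the pragma itself decides to analyze anything depends on
the SQLite version and on what the connection did, so the sketch calls
ANALYZE explicitly and does not rely on the pragma's outcome.

```python
import sqlite3


def analyzed_indexes(conn):
    """Return the names of indexes that have statistics in sqlite_stat1."""
    has_stat1 = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'table' AND name = 'sqlite_stat1'"
    ).fetchone()
    if has_stat1 is None:
        return set()  # ANALYZE has never been run on this database
    rows = conn.execute(
        "SELECT idx FROM sqlite_stat1 WHERE idx IS NOT NULL").fetchall()
    return {idx for (idx,) in rows}


def close_with_optimize(conn):
    """Run PRAGMA optimize right before closing the connection."""
    conn.execute("PRAGMA optimize;").fetchall()
    conn.close()


def demo():
    """Create an index (as a migration might) and look for its statistics."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE builds (id INTEGER PRIMARY KEY, status INTEGER)")
    conn.executemany("INSERT INTO builds (status) VALUES (?)",
                     [(i % 3,) for i in range(300)])
    conn.execute("CREATE INDEX builds_status ON builds (status)")
    conn.commit()

    before = analyzed_indexes(conn)   # nothing analyzed yet
    conn.execute("ANALYZE builds;")   # explicit, unconditional analysis
    after = analyzed_indexes(conn)    # now includes the new index
    close_with_optimize(conn)
    return before, after


if __name__ == "__main__":
    print(demo())
```

A check like analyzed_indexes could be used after a migration to see whether
the startup "PRAGMA optimize;" actually picked up the new index on a given
SQLite version.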
> Replaying the WAL sounds like a good idea at the beginning, though. Most
> journalling filesystems do that too.
Cool :)