unofficial mirror of guix-devel@gnu.org 
* [PATCH] cuirass: Perform some database "optimization" at startup.
@ 2020-05-25 10:15 Christopher Baines
  2020-05-25 10:48 ` Danny Milosavljevic
  0 siblings, 1 reply; 3+ messages in thread
From: Christopher Baines @ 2020-05-25 10:15 UTC (permalink / raw)
  To: guix-devel

Add an "optimize" step that runs when the main Cuirass process starts
up. Currently this does two things, but it could be extended to do more.

The "PRAGMA optimize;" command prompts SQLite to ANALYZE tables where that
might help. The "PRAGMA wal_checkpoint(TRUNCATE);" command has SQLite process
any unprocessed changes from the WAL file, then truncate it to 0 bytes. I've
got no data to suggest this helps with performance, but I'm hoping that going
from a large WAL file to a small one occasionally might be useful.

* src/cuirass/database.scm (db-optimize): New procedure.
* bin/cuirass.in (main): Run it.
---
 bin/cuirass.in           | 4 ++++
 src/cuirass/database.scm | 8 ++++++++
 2 files changed, 12 insertions(+)

diff --git a/bin/cuirass.in b/bin/cuirass.in
index fbc7c3c..7a2d5ae 100644
--- a/bin/cuirass.in
+++ b/bin/cuirass.in
@@ -124,6 +124,10 @@ exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
                              (min (current-processor-count) 4))))
           (prepare-git)
 
+          (unless (option-ref opts 'web #f)
+            (log-message "performing database optimizations")
+            (db-optimize))
+
           (log-message "running Fibers on ~a kernel threads" threads)
           (run-fibers
            (lambda ()
diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
index f80585e..e81ead0 100644
--- a/src/cuirass/database.scm
+++ b/src/cuirass/database.scm
@@ -38,6 +38,7 @@
             db-init
             db-open
             db-close
+            db-optimize
             db-add-specification
             db-remove-specification
             db-get-specifications
@@ -277,6 +278,13 @@ database object."
   "Close database object DB."
   (sqlite-close db))
 
+(define* (db-optimize #:optional (db-file (%package-database)))
+  "Open the database and perform optimizations."
+  (let ((db (db-open db-file)))
+    (sqlite-exec db "PRAGMA optimize;")
+    (sqlite-exec db "PRAGMA wal_checkpoint(TRUNCATE);")
+    (db-close db)))
+
 (define (last-insert-rowid db)
   (vector-ref (car (sqlite-exec db "SELECT last_insert_rowid();"))
               0))
-- 
2.26.2
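
The two pragmas in the patch can be tried outside Cuirass. The following is a minimal sketch in Python using the standard `sqlite3` module, not Cuirass's Guile code. The function name `optimize_database` is an illustrative assumption; it mirrors the open, run, close shape of `db-optimize` and reports the size of the WAL file after the checkpoint.

```python
import os
import sqlite3


def optimize_database(path):
    """Run the two pragmas from the patch; return the WAL size in bytes."""
    # isolation_level=None selects autocommit mode, so this connection
    # leaves no transaction open when the checkpoint runs.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        # Lets SQLite decide whether running ANALYZE on some tables might help.
        conn.execute("PRAGMA optimize").fetchall()
        # Returns one row: (busy, frames in WAL, frames checkpointed).
        busy = conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()[0]
        if busy:
            raise RuntimeError("checkpoint blocked by another connection")
        wal_path = path + "-wal"
        # Measure before closing: the WAL file may be removed once the
        # last connection to the database closes.
        return os.path.getsize(wal_path) if os.path.exists(wal_path) else 0
    finally:
        conn.close()
```

A TRUNCATE checkpoint can be blocked by a connection that still holds a read or write transaction, which is why the sketch checks the first column of the result instead of assuming success.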




* Re: [PATCH] cuirass: Perform some database "optimization" at startup.
  2020-05-25 10:15 [PATCH] cuirass: Perform some database "optimization" at startup Christopher Baines
@ 2020-05-25 10:48 ` Danny Milosavljevic
  2020-05-25 12:42   ` Christopher Baines
  0 siblings, 1 reply; 3+ messages in thread
From: Danny Milosavljevic @ 2020-05-25 10:48 UTC (permalink / raw)
  To: Christopher Baines; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 676 bytes --]

Hi Chris,

the docs at https://www.sqlite.org/pragma.html#pragma_optimize suggest running
"PRAGMA optimize" at the end of the connection, or periodically--not at the
beginning.

That makes sense since it has to be able to see which queries are emitted
in order to know what to optimize.

Also, docs say:

> The query planner used sqlite_stat1-style statistics for one or more indexes
> of the table at some point during the lifetime of the current connection. 

That probably means one would have had to run ANALYZE at some point in the past.


Replaying the WAL sounds like a good idea at the beginning, though.  Most
journalling filesystems do that too.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
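
The suggestion in this message, running "PRAGMA optimize" at the end of a connection, might look like the sketch below. It is written in Python with the standard `sqlite3` module purely for illustration; `optimized_connection` is a hypothetical helper name and nothing here is taken from the Cuirass code base.

```python
import contextlib
import sqlite3


@contextlib.contextmanager
def optimized_connection(path):
    """Yield a connection and run PRAGMA optimize just before closing it."""
    conn = sqlite3.connect(path)
    try:
        yield conn
        # Commit pending work first so the pragma runs outside a transaction.
        conn.commit()
        # Run last, after the connection has executed its queries.
        conn.execute("PRAGMA optimize").fetchall()
    finally:
        conn.close()
```

With this shape the pragma is skipped if the body raises, which seems reasonable for a step that is only an optimization hint.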


* Re: [PATCH] cuirass: Perform some database "optimization" at startup.
  2020-05-25 10:48 ` Danny Milosavljevic
@ 2020-05-25 12:42   ` Christopher Baines
  0 siblings, 0 replies; 3+ messages in thread
From: Christopher Baines @ 2020-05-25 12:42 UTC (permalink / raw)
  To: Danny Milosavljevic; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1724 bytes --]


Danny Milosavljevic <dannym@scratchpost.org> writes:

> the docs at https://www.sqlite.org/pragma.html#pragma_optimize suggest running
> "PRAGMA optimize" at the end of the connection, or periodically--not at the
> beginning.
>
> That makes sense since it has to be able to see which queries are emitted
> in order to know what to optimize.
>
> Also, docs say:
>
>> The query planner used sqlite_stat1-style statistics for one or more indexes
>> of the table at some point during the lifetime of the current connection.
>
> That probably means one would have had to run ANALYZE at some point in the past.

Thanks Danny, this is interesting.

From my reading of the docs, I think the only thing the optimize pragma
is going to do is run ANALYZE on some tables. There's something about
the current connection referenced in "Determination Of When To Run
Analyze", but it's not the only thing that triggers this.

There is some stuff mentioned about recent queries, but it seems to be
prefixed with "(Not yet implemented)".

I'm not sure where this would fit in the Cuirass code when connections
are closed, as I'm not sure where the connections are closed! In terms
of running this regularly, I'm up for trying to work that in; maybe it
could happen after new data has been added, or something like that.

Although I don't have any evidence to support this, I'm hoping running
the optimize pragma at startup will help in some cases, like when
migrations add new indexes, as I think the docs say SQLite will analyze
a table if an index hasn't been analyzed yet.

> Replaying the WAL sounds like a good idea at the beginning, though.  Most
> journalling filesystems do that too.

Cool :)

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 962 bytes --]
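
One way to gather evidence for the case discussed in this message, where a migration adds an index that has never been analyzed, is to compare the indexes listed in `sqlite_master` with the rows in `sqlite_stat1`. The sketch below uses Python's standard `sqlite3` module; `unanalyzed_indexes` is a hypothetical helper, not something Cuirass provides. It only reports which indexes lack statistics and makes no claim about when "PRAGMA optimize" decides to act on them.

```python
import sqlite3


def unanalyzed_indexes(conn):
    """Return the names of indexes that have no row in sqlite_stat1."""
    # sqlite_stat1 only exists once ANALYZE has run at least once.
    has_stats = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE name = 'sqlite_stat1'"
    ).fetchone()
    analyzed = set()
    if has_stats:
        analyzed = {row[0] for row in conn.execute(
            "SELECT idx FROM sqlite_stat1 WHERE idx IS NOT NULL")}
    indexes = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name")]
    return [name for name in indexes if name not in analyzed]
```

Calling this before and after the startup step would show whether any statistics were actually gathered for a newly added index.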

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git
