unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
* [bug#41658] [PATCH] fixes / improvements for (guix store database)
@ 2020-06-02  6:31 Caleb Ristvedt
  2020-06-04 16:40 ` Ludovic Courtès
  0 siblings, 1 reply; 6+ messages in thread
From: Caleb Ristvedt @ 2020-06-02  6:31 UTC (permalink / raw)
  To: 41658


[-- Attachment #1.1: Type: text/plain, Size: 471 bytes --]

After some pondering about why the database might be locked so
frequently, this is what I've managed to come up with. The first patch
is the most likely to actually help with that, and the others mostly
involve improving robustness.

Ideally we'd come up with a test to quantify how much these kinds of
changes affect contention over the database. For now, though, all that I
can think of is seeing how this affects the systems that have had issues
with that.

- reepca


[-- Attachment #1.2: 0001-database-work-around-guile-sqlite3-bug-preventing-st.patch --]
[-- Type: text/x-patch, Size: 3299 bytes --]

From cce653c590be1506e15044e445aa9805370ac759 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 18:50:07 -0500
Subject: [PATCH 1/4] database: work around guile-sqlite3 bug preventing
 statement reset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
keep re-preparing statements that are frequently used.  Unfortunately it
doesn't quite emulate the semantics of sqlite_finalize properly, because it
doesn't cause a commit if the statement being finalized is the last "active"
statement.  We work around this by wrapping sqlite-finalize with our own
version that ensures sqlite-reset is called, which does The Right Thing™.

* guix/store/database.scm (sqlite-finalize): new procedure that shadows the
  sqlite-finalize from (sqlite3).
---
 guix/store/database.scm | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/guix/store/database.scm b/guix/store/database.scm
index ef52036ede..d4251e580e 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -130,6 +130,36 @@ transaction after it finishes."
 If FILE doesn't exist, create it and initialize it as a new database."
   (call-with-database file (lambda (db) exp ...)))
 
+(define (sqlite-finalize stmt)
+  ;; Cached statements aren't reset when sqlite-finalize is invoked on
+  ;; them. This can cause problems with automatically-started transactions:
+  ;;
+  ;; "An implicit transaction (a transaction that is started automatically,
+  ;; not a transaction started by BEGIN) is committed automatically when the
+  ;; last active statement finishes. A statement finishes when its last cursor
+  ;; closes, which is guaranteed to happen when the prepared statement is
+  ;; reset or finalized. Some statements might "finish" for the purpose of
+  ;; transaction control prior to being reset or finalized, but there is no
+  ;; guarantee of this."
+  ;;
+  ;; Thus, it's possible for an implicitly-started transaction to hang around
+  ;; until sqlite-reset is called when the cached statement is next
+  ;; used. Because the transaction is committed automatically only when the
+  ;; *last active statement* finishes, the implicitly-started transaction may
+  ;; later be upgraded to a write transaction (!) and this non-reset statement
+  ;; will still be keeping the transaction from committing until it is next
+  ;; used or the database connection is closed. This has the potential to make
+  ;; (exclusive) write access to the database necessary for much longer than
+  ;; it should be.
+  ;;
+  ;; (see https://www.sqlite.org/lang_transaction.html)
+  ;; To work around this, we wrap sqlite-finalize so that sqlite-reset is
+  ;; always called. This will continue working even when the behavior is fixed
+  ;; in guile-sqlite3, since resetting twice doesn't cause any problems. We
+  ;; can remove this once the fixed guile-sqlite3 is widespread.
+  (sqlite-reset stmt)
+  ((@ (sqlite3) sqlite-finalize) stmt))
+
 (define (last-insert-row-id db)
   ;; XXX: (sqlite3) currently lacks bindings for 'sqlite3_last_insert_rowid'.
   ;; Work around that.
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.3: 0002-database-rewrite-query-procedures-in-terms-of-with-s.patch --]
[-- Type: text/x-patch, Size: 5878 bytes --]

From ee24ab21122b1c75a7d67d7062550e15e54ab62f Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 19:21:43 -0500
Subject: [PATCH 2/4] database: rewrite query procedures in terms of
 with-statement.

Most of our queries would fail to finalize their statements properly if sqlite
returned an error during their execution.  This resolves that, and also makes
them somewhat more concise as a side-effect.

This also makes some small changes to improve certain queries where behavior
was strange or overly verbose.

* guix/store/database.scm (call-with-statement): new procedure.
  (with-statement): new macro.
  (last-insert-row-id, path-id, update-or-insert, add-references): rewrite to
  use with-statement.
  (update-or-insert): factor last-insert-row-id out of the end of both
  branches.
  (add-references): remove pointless last-insert-row-id call.

* .dir-locals.el (with-statement): add indenting information.
---
 .dir-locals.el          |  1 +
 guix/store/database.scm | 53 ++++++++++++++++++++++-------------------
 2 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index fcde914e60..a085269e85 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -89,6 +89,7 @@
 
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
+   (eval . (put 'with-statement 'scheme-indent-function 3))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index d4251e580e..2209da3df1 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -160,14 +160,26 @@ If FILE doesn't exist, create it and initialize it as a new database."
   (sqlite-reset stmt)
   ((@ (sqlite3) sqlite-finalize) stmt))
 
+(define (call-with-statement db sql proc)
+  (let ((stmt (sqlite-prepare db sql #:cache? #t)))
+    (dynamic-wind
+      (const #t)
+      (lambda ()
+        (proc stmt))
+      (lambda ()
+        (sqlite-finalize stmt)))))
+
+(define-syntax-rule (with-statement db sql stmt exp ...)
+  "Run EXP... with STMT bound to a prepared statement corresponding to the sql
+string SQL for DB."
+  (call-with-statement db sql
+                       (lambda (stmt) exp ...)))
+
 (define (last-insert-row-id db)
   ;; XXX: (sqlite3) currently lacks bindings for 'sqlite3_last_insert_rowid'.
   ;; Work around that.
-  (let* ((stmt   (sqlite-prepare db "SELECT last_insert_rowid();"
-                                 #:cache? #t))
-         (result (sqlite-fold cons '() stmt)))
-    (sqlite-finalize stmt)
-    (match result
+  (with-statement db "SELECT last_insert_rowid();" stmt
+    (match (sqlite-fold cons '() stmt)
       ((#(id)) id)
       (_ #f))))
 
@@ -177,13 +189,11 @@ If FILE doesn't exist, create it and initialize it as a new database."
 (define* (path-id db path)
   "If PATH exists in the 'ValidPaths' table, return its numerical
 identifier.  Otherwise, return #f."
-  (let ((stmt (sqlite-prepare db path-id-sql #:cache? #t)))
+  (with-statement db path-id-sql stmt
     (sqlite-bind-arguments stmt #:path path)
-    (let ((result (sqlite-fold cons '() stmt)))
-      (sqlite-finalize stmt)
-      (match result
-        ((#(id) . _) id)
-        (_ #f)))))
+    (match (sqlite-fold cons '() stmt)
+      ((#(id) . _) id)
+      (_ #f))))
 
 (define update-sql
   "UPDATE ValidPaths SET hash = :hash, registrationTime = :time, deriver =
@@ -200,20 +210,17 @@ and re-inserting instead of updating, which causes problems with foreign keys,
 of course. Returns the row id of the row that was modified or inserted."
   (let ((id (path-id db path)))
     (if id
-        (let ((stmt (sqlite-prepare db update-sql #:cache? #t)))
+        (with-statement db update-sql stmt
           (sqlite-bind-arguments stmt #:id id
                                  #:deriver deriver
                                  #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt)
-          (sqlite-finalize stmt)
-          (last-insert-row-id db))
-        (let ((stmt (sqlite-prepare db insert-sql #:cache? #t)))
+          (sqlite-fold cons '() stmt))
+        (with-statement db insert-sql stmt
           (sqlite-bind-arguments stmt
                                  #:path path #:deriver deriver
                                  #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt)             ;execute it
-          (sqlite-finalize stmt)
-          (last-insert-row-id db)))))
+          (sqlite-fold cons '() stmt)))
+    (last-insert-row-id db)))
 
 (define add-reference-sql
   "INSERT OR REPLACE INTO Refs (referrer, reference) VALUES (:referrer, :reference);")
@@ -221,15 +228,13 @@ of course. Returns the row id of the row that was modified or inserted."
 (define (add-references db referrer references)
   "REFERRER is the id of the referring store item, REFERENCES is a list
 ids of items referred to."
-  (let ((stmt (sqlite-prepare db add-reference-sql #:cache? #t)))
+  (with-statement db add-reference-sql stmt
     (for-each (lambda (reference)
                 (sqlite-reset stmt)
                 (sqlite-bind-arguments stmt #:referrer referrer
                                        #:reference reference)
-                (sqlite-fold cons '() stmt)       ;execute it
-                (last-insert-row-id db))
-              references)
-    (sqlite-finalize stmt)))
+                (sqlite-fold cons '() stmt))
+              references)))
 
 (define* (sqlite-register db #:key path (references '())
                           deriver hash nar-size time)
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.4: 0003-database-ensure-update-or-insert-is-run-within-a-tra.patch --]
[-- Type: text/x-patch, Size: 5730 bytes --]

From 7d34c27c33aed3e8a49b9796a62a8c19d352e653 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 21:43:14 -0500
Subject: [PATCH 3/4] database: ensure update-or-insert is run within a
 transaction

update-or-insert can break if an insert occurs between when it decides whether
to update or insert and when it actually performs that operation.  Putting the
check and the update/insert operation in the same transaction ensures that the
update/insert will only succeed if no other write has occurred in the middle.

* guix/store/database.scm (call-with-savepoint): new procedure.
  (update-or-insert): use call-with-savepoint to ensure the read and the
  insert/update occur within the same transaction.
---
 .dir-locals.el          |  1 +
 guix/store/database.scm | 68 +++++++++++++++++++++++++++++++++--------
 2 files changed, 56 insertions(+), 13 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index a085269e85..ef25cb100a 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -90,6 +90,7 @@
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
    (eval . (put 'with-statement 'scheme-indent-function 3))
+   (eval . (put 'call-with-savepoint 'scheme-indent-function 1))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 2209da3df1..3955c48b1f 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -120,6 +120,26 @@ transaction after it finishes."
           (begin
             (sqlite-exec db "rollback;")
             (throw 'sqlite-error who error description))))))
+(define* (call-with-savepoint db proc
+                              #:optional (savepoint-name "SomeSavepoint"))
+  "Call PROC after creating a savepoint named SAVEPOINT-NAME.  If PROC exits
+abnormally, rollback to that savepoint.  In all cases, remove the savepoint
+prior to returning."
+  (define (exec sql)
+    (with-statement db sql stmt
+      (sqlite-fold cons '() stmt)))
+
+  (dynamic-wind
+    (lambda ()
+      (exec (string-append "SAVEPOINT " savepoint-name ";")))
+    (lambda ()
+      (catch #t
+        proc
+        (lambda args
+          (exec (string-append "ROLLBACK TO " savepoint-name ";"))
+          (apply throw args))))
+    (lambda ()
+      (exec (string-append "RELEASE " savepoint-name ";")))))
 
 (define %default-database-file
   ;; Default location of the store database.
@@ -208,19 +228,41 @@ VALUES (:path, :hash, :time, :deriver, :size)")
 doesn't exactly have... they've got something close, but it involves deleting
 and re-inserting instead of updating, which causes problems with foreign keys,
 of course. Returns the row id of the row that was modified or inserted."
-  (let ((id (path-id db path)))
-    (if id
-        (with-statement db update-sql stmt
-          (sqlite-bind-arguments stmt #:id id
-                                 #:deriver deriver
-                                 #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt))
-        (with-statement db insert-sql stmt
-          (sqlite-bind-arguments stmt
-                                 #:path path #:deriver deriver
-                                 #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt)))
-    (last-insert-row-id db)))
+
+  ;; It's important that querying the path-id and the insert/update operation
+  ;; take place in the same transaction, as otherwise some other
+  ;; process/thread/fiber could register the same path between when we check
+  ;; whether it's already registered and when we register it, resulting in
+  ;; duplicate paths (which, due to a 'unique' constraint, would cause an
+  ;; exception to be thrown). With the default journaling mode this will
+  ;; prevent writes from occurring during that sensitive time, but with WAL
+  ;; mode it will instead arrange to return SQLITE_BUSY when a write occurs
+  ;; between the start of a read transaction and its upgrading to a write
+  ;; transaction (see https://sqlite.org/rescode.html#busy_snapshot).
+  ;; Experimentally, it seems this SQLITE_BUSY will ignore a busy_timeout and
+  ;; immediately return (makes sense, since waiting won't change anything).
+
+  ;; Note that when that kind of SQLITE_BUSY error is returned, it will keep
+  ;; being returned every time we try to upgrade the same outermost
+  ;; transaction to a write transaction.  So when retrying, we have to restart
+  ;; the *outermost* write transaction.  We can't inherently tell whether
+  ;; we're the outermost write transaction, so we leave the retry-handling to
+  ;; the caller.
+  (call-with-savepoint db
+    (lambda ()
+      (let ((id (path-id db path)))
+        (if id
+            (with-statement db update-sql stmt
+              (sqlite-bind-arguments stmt #:id id
+                                     #:deriver deriver
+                                     #:hash hash #:size nar-size #:time time)
+              (sqlite-fold cons '() stmt))
+            (with-statement db insert-sql stmt
+              (sqlite-bind-arguments stmt
+                                     #:path path #:deriver deriver
+                                     #:hash hash #:size nar-size #:time time)
+              (sqlite-fold cons '() stmt)))
+        (last-insert-row-id db)))))
 
 (define add-reference-sql
   "INSERT OR REPLACE INTO Refs (referrer, reference) VALUES (:referrer, :reference);")
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.5: 0004-database-separate-transaction-handling-and-retry-han.patch --]
[-- Type: text/x-patch, Size: 6216 bytes --]

From e30271728dfb23324c981d226c752b17689c9eef Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 22:15:21 -0500
Subject: [PATCH 4/4] database: separate transaction-handling and
 retry-handling.

Previously call-with-transaction would both retry when SQLITE_BUSY errors were
thrown and do what its name suggested (start and rollback/commit a
transaction).  This changes it to do only what its name implies, which
simplifies its implementation.  Retrying is provided by the new
call-with-SQLITE_BUSY-retrying procedure.

* guix/store/database.scm (call-with-transaction): no longer restarts, new
  #:restartable? argument controls whether "begin" or "begin immediate" is
  used.
  (call-with-SQLITE_BUSY-retrying, call-with-retrying-transaction,
  call-with-retrying-savepoint): new procedures.
  (register-items): use call-with-retrying-transaction to preserve old
  behavior.

* .dir-locals.el (call-with-retrying-transaction,
  call-with-retrying-savepoint): add indentation information.
---
 .dir-locals.el          |  2 ++
 guix/store/database.scm | 69 +++++++++++++++++++++++++++++------------
 2 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index ef25cb100a..e9dccd0511 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -90,7 +90,9 @@
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
    (eval . (put 'with-statement 'scheme-indent-function 3))
+   (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 2))
    (eval . (put 'call-with-savepoint 'scheme-indent-function 1))
+   (eval . (put 'call-with-retrying-savepoint 'scheme-indent-function 1))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 3955c48b1f..2a78379dac 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -99,27 +99,44 @@ create it and initialize it as a new database."
 ;; XXX: missing in guile-sqlite3@0.1.0
 (define SQLITE_BUSY 5)
 
-(define (call-with-transaction db proc)
-  "Start a transaction with DB (make as many attempts as necessary) and run
-PROC.  If PROC exits abnormally, abort the transaction, otherwise commit the
-transaction after it finishes."
+(define (call-with-SQLITE_BUSY-retrying thunk)
+  "Call THUNK, retrying as long as it exits abnormally due to SQLITE_BUSY
+errors."
   (catch 'sqlite-error
+    thunk
+    (lambda (key who code errmsg)
+      (if (= code SQLITE_BUSY)
+          (call-with-SQLITE_BUSY-retrying thunk)
+          (throw key who code errmsg)))))
+
+
+
+(define* (call-with-transaction db proc #:key restartable?)
+  "Start a transaction with DB and run PROC.  If PROC exits abnormally, abort
+the transaction, otherwise commit the transaction after it finishes.
+RESTARTABLE? may be set to a non-#f value when it is safe to run PROC multiple
+times.  This may reduce contention for the database somewhat."
+  (define (exec sql)
+    (with-statement db sql stmt
+      (sqlite-fold cons '() stmt)))
+  ;; We might use begin immediate here so that if we need to retry, we figure
+  ;; that out immediately rather than because some SQLITE_BUSY exception gets
+  ;; thrown partway through PROC - in which case the part already executed
+  ;; (which may contain side-effects!) might have to be executed again for
+  ;; every retry.
+  (exec (if restartable? "begin;" "begin immediate;"))
+  (catch #t
     (lambda ()
-      ;; We use begin immediate here so that if we need to retry, we
-      ;; figure that out immediately rather than because some SQLITE_BUSY
-      ;; exception gets thrown partway through PROC - in which case the
-      ;; part already executed (which may contain side-effects!) would be
-      ;; executed again for every retry.
-      (sqlite-exec db "begin immediate;")
-      (let ((result (proc)))
-        (sqlite-exec db "commit;")
-        result))
-    (lambda (key who error description)
-      (if (= error SQLITE_BUSY)
-          (call-with-transaction db proc)
-          (begin
-            (sqlite-exec db "rollback;")
-            (throw 'sqlite-error who error description))))))
+      (let-values ((result (proc)))
+        (exec "commit;")
+        (apply values result)))
+    (lambda args
+      ;; The roll back may or may not have occurred automatically when the
+      ;; error was generated. If it has occurred, this does nothing but signal
+      ;; an error. If it hasn't occurred, this needs to be done.
+      (false-if-exception (exec "rollback;"))
+      (apply throw args))))
+
 (define* (call-with-savepoint db proc
                               #:optional (savepoint-name "SomeSavepoint"))
   "Call PROC after creating a savepoint named SAVEPOINT-NAME.  If PROC exits
@@ -141,6 +158,18 @@ prior to returning."
     (lambda ()
       (exec (string-append "RELEASE " savepoint-name ";")))))
 
+(define* (call-with-retrying-transaction db proc #:key restartable?)
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-transaction db proc #:restartable? restartable?))))
+
+(define* (call-with-retrying-savepoint db proc
+                                       #:optional (savepoint-name
+                                                   "SomeSavepoint"))
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-savepoint db proc savepoint-name))))
+
 (define %default-database-file
   ;; Default location of the store database.
   (string-append %store-database-directory "/db.sqlite"))
@@ -431,7 +460,7 @@ Write a progress report to LOG-PORT."
   (mkdir-p db-dir)
   (parameterize ((sql-schema schema))
     (with-database (string-append db-dir "/db.sqlite") db
-      (call-with-transaction db
+      (call-with-retrying-transaction db
           (lambda ()
             (let* ((prefix   (format #f "registering ~a items" (length items)))
                    (progress (progress-reporter/bar (length items)
-- 
2.26.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [bug#41658] [PATCH] fixes / improvements for (guix store database)
  2020-06-02  6:31 [bug#41658] [PATCH] fixes / improvements for (guix store database) Caleb Ristvedt
@ 2020-06-04 16:40 ` Ludovic Courtès
  2020-06-04 17:00   ` Danny Milosavljevic
  2020-06-08  5:52   ` Caleb Ristvedt
  0 siblings, 2 replies; 6+ messages in thread
From: Ludovic Courtès @ 2020-06-04 16:40 UTC (permalink / raw)
  To: Caleb Ristvedt; +Cc: 41658

Hi,

Thanks for the thorough investigation and for the patches!

Caleb Ristvedt <caleb.ristvedt@cune.org> skribis:

> From cce653c590be1506e15044e445aa9805370ac759 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 18:50:07 -0500
> Subject: [PATCH 1/4] database: work around guile-sqlite3 bug preventing
>  statement reset
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
> keep re-preparing statements that are frequently used.  Unfortunately it
> doesn't quite emulate the semantics of sqlite_finalize properly, because it
> doesn't cause a commit if the statement being finalized is the last "active"
> statement.  We work around this by wrapping sqlite-finalize with our own
> version that ensures sqlite-reset is called, which does The Right Thing™.
>
> * guix/store/database.scm (sqlite-finalize): new procedure that shadows the
>   sqlite-finalize from (sqlite3).

Nice.  It would be great if you could report it upstream (Danny and/or
myself can then patch it directly in guile-sqlite3 and push out a
release) and refer to the issue from here.

We can have this patch locally in the meantime, unless it would break
once the new guile-sqlite3 is out.  WDYT?

> From ee24ab21122b1c75a7d67d7062550e15e54ab62f Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 19:21:43 -0500
> Subject: [PATCH 2/4] database: rewrite query procedures in terms of
>  with-statement.
>
> Most of our queries would fail to finalize their statements properly if sqlite
> returned an error during their execution.  This resolves that, and also makes
> them somewhat more concise as a side-effect.
>
> This also makes some small changes to improve certain queries where behavior
> was strange or overly verbose.
>
> * guix/store/database.scm (call-with-statement): new procedure.
>   (with-statement): new macro.
>   (last-insert-row-id, path-id, update-or-insert, add-references): rewrite to
>   use with-statement.
>   (update-or-insert): factor last-insert-row-id out of the end of both
>   branches.
>   (add-references): remove pointless last-insert-row-id call.
>
> * .dir-locals.el (with-statement): add indenting information.

LGTM!

> From 7d34c27c33aed3e8a49b9796a62a8c19d352e653 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 21:43:14 -0500
> Subject: [PATCH 3/4] database: ensure update-or-insert is run within a
>  transaction
>
> update-or-insert can break if an insert occurs between when it decides whether
> to update or insert and when it actually performs that operation.  Putting the
> check and the update/insert operation in the same transaction ensures that the
> update/insert will only succeed if no other write has occurred in the middle.
>
> * guix/store/database.scm (call-with-savepoint): new procedure.
>   (update-or-insert): use call-with-savepoint to ensure the read and the
>   insert/update occur within the same transaction.

That’s a bit beyond my understanding, but I think you can also push this
one.  :-)

Make sure “make check TESTS=tests/store-database.scm” is still happy.

Thanks a lot!

Ludo’.




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bug#41658] [PATCH] fixes / improvements for (guix store database)
  2020-06-04 16:40 ` Ludovic Courtès
@ 2020-06-04 17:00   ` Danny Milosavljevic
  2020-06-05 16:19     ` Ludovic Courtès
  2020-06-08  5:52   ` Caleb Ristvedt
  1 sibling, 1 reply; 6+ messages in thread
From: Danny Milosavljevic @ 2020-06-04 17:00 UTC (permalink / raw)
  To: Ludovic Courtès, Caleb Ristvedt; +Cc: 41658

[-- Attachment #1: Type: text/plain, Size: 2983 bytes --]

Hi Ludo,
Hi Caleb,

On Thu, 04 Jun 2020 18:40:35 +0200
Ludovic Courtès <ludo@gnu.org> wrote:

> Nice.  It would be great if you could report it upstream (Danny and/or
> myself can then patch it directly in guile-sqlite3 and push out a
> release) and refer to the issue from here.

I agree.  It's easy to change sqlite-finalize in guile-sqlite3 to
call sqlite-reset, basically just adapt

(define sqlite-finalize
  (let ((f (pointer->procedure
            int
            (dynamic-func "sqlite3_finalize" libsqlite3)
            (list '*))))
    (lambda (stmt)
      ;; Note: When STMT is cached, this is a no-op.  This ensures caching
      ;; actually works while still separating concerns: users can turn
      ;; caching on and off without having to change the rest of their code.
      (when (and (stmt-live? stmt)
                 (not (stmt-cached? stmt)))
        (let ((p (stmt-pointer stmt)))
          (sqlite-remove-statement! (stmt->db stmt) stmt)
          (set-stmt-live?! stmt #f)
          (f p))))))

so that it calls sqlite-reset in the "when"'s new "else" branch there.

(we could also always call sqlite3_reset on sqlite-finalize anyway, it wouldn't
hurt but it wouldn't help either)

I agree that sqlite-finalize should model sqlite's finalization behavior as
much as possible.

Also, the comment about this being a no-op is not true then anymore.

We should definitely also pick up Caleb's comment upstream:

+  ;; Cached statements aren't reset when sqlite-finalize is invoked on
+  ;; them. This can cause problems with automatically-started transactions:
+  ;;
+  ;; "An implicit transaction (a transaction that is started automatically,
+  ;; not a transaction started by BEGIN) is committed automatically when the
+  ;; last active statement finishes. A statement finishes when its last cursor
+  ;; closes, which is guaranteed to happen when the prepared statement is
+  ;; reset or finalized. Some statements might "finish" for the purpose of
+  ;; transaction control prior to being reset or finalized, but there is no
+  ;; guarantee of this."
+  ;;
+  ;; Thus, it's possible for an implicitly-started transaction to hang around
+  ;; until sqlite-reset is called when the cached statement is next
+  ;; used. Because the transaction is committed automatically only when the
+  ;; *last active statement* finishes, the implicitly-started transaction may
+  ;; later be upgraded to a write transaction (!) and this non-reset statement
+  ;; will still be keeping the transaction from committing until it is next
+  ;; used or the database connection is closed. This has the potential to make
+  ;; (exclusive) write access to the database necessary for much longer than
+  ;; it should be.
+  ;;
+  ;; (see https://www.sqlite.org/lang_transaction.html)

@Caleb:

Could you file an issue at https://notabug.org/guile-sqlite3/guile-sqlite3/issues
and pull request so this is auditable?

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bug#41658] [PATCH] fixes / improvements for (guix store database)
  2020-06-04 17:00   ` Danny Milosavljevic
@ 2020-06-05 16:19     ` Ludovic Courtès
  0 siblings, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2020-06-05 16:19 UTC (permalink / raw)
  To: Danny Milosavljevic; +Cc: Caleb Ristvedt, 41658

Hi Danny!

Danny Milosavljevic <dannym@scratchpost.org> skribis:

> I agree.  It's easy to change sqlite-finalize in guile-sqlite3 to
> call sqlite-reset, basically just adapt

[...]

> @Caleb:
>
> Could you file an issue at https://notabug.org/guile-sqlite3/guile-sqlite3/issues
> and pull request so this is auditable?

Agreed.

Danny, once this is merged upstream, could you tag a new release?  There
are a few other useful improvements in there.

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bug#41658] [PATCH] fixes / improvements for (guix store database)
  2020-06-04 16:40 ` Ludovic Courtès
  2020-06-04 17:00   ` Danny Milosavljevic
@ 2020-06-08  5:52   ` Caleb Ristvedt
  2020-06-09  8:42     ` Ludovic Courtès
  1 sibling, 1 reply; 6+ messages in thread
From: Caleb Ristvedt @ 2020-06-08  5:52 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Danny Milosavljevic, 41658


[-- Attachment #1.1: Type: text/plain, Size: 3908 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Thanks for the thorough investigation and for the patches!
>
> Caleb Ristvedt <caleb.ristvedt@cune.org> skribis:
>
>> From cce653c590be1506e15044e445aa9805370ac759 Mon Sep 17 00:00:00 2001
>> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
>> Date: Mon, 1 Jun 2020 18:50:07 -0500
>> Subject: [PATCH 1/4] database: work around guile-sqlite3 bug preventing
>>  statement reset
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
>> keep re-preparing statements that are frequently used.  Unfortunately it
>> doesn't quite emulate the semantics of sqlite_finalize properly, because it
>> doesn't cause a commit if the statement being finalized is the last "active"
>> statement.  We work around this by wrapping sqlite-finalize with our own
>> version that ensures sqlite-reset is called, which does The Right Thing™.
>>
>> * guix/store/database.scm (sqlite-finalize): new procedure that shadows the
>>   sqlite-finalize from (sqlite3).
>
> Nice.  It would be great if you could report it upstream (Danny and/or
> myself can then patch it directly in guile-sqlite3 and push out a
> release) and refer to the issue from here.

Reported as https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12,
with corresponding pull request
https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.

> We can have this patch locally in the meantime, unless it would break
> once the new guile-sqlite3 is out.  WDYT?

I've attached an updated patch series that both includes the
guile-sqlite3 fix as a patch to the guile-sqlite3 package and adopts the
workaround for situations where older guile-sqlite3's must be used (for
example, when building guix from scratch on foreign distros that haven't
incorporated the fix yet). The only changes are the addition of the
now-second patch and fixing up some spacing in the comment in the first
patch.

>> From 7d34c27c33aed3e8a49b9796a62a8c19d352e653 Mon Sep 17 00:00:00 2001
>> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
>> Date: Mon, 1 Jun 2020 21:43:14 -0500
>> Subject: [PATCH 3/4] database: ensure update-or-insert is run within a
>>  transaction
>>
>> update-or-insert can break if an insert occurs between when it decides whether
>> to update or insert and when it actually performs that operation.  Putting the
>> check and the update/insert operation in the same transaction ensures that the
>> update/insert will only succeed if no other write has occurred in the middle.
>>
>> * guix/store/database.scm (call-with-savepoint): new procedure.
>>   (update-or-insert): use call-with-savepoint to ensure the read and the
>>   insert/update occur within the same transaction.
>
> That’s a bit beyond my understanding, but I think you can also push this
> one.  :-)

Basically, it's like combining the body of two separate compare-and-swap
loops into a single compare-and-swap loop. This ensures that the view is
consistent (since if it isn't, the "compare" will fail and we'll
retry). It addresses a problem that doesn't exist in practice yet, since
update-or-insert is only called from within a call-with-transaction
currently. But if someone ever wanted to call it from outside of a
call-with-transaction, this would ensure that it still worked correctly.

> Make sure “make check TESTS=tests/store-database.scm” is still happy.

Works on my system.

Related question: does berlin export /var/guix over NFS as per
http://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster? If so,
that could interact poorly with our use of WAL mode:

"All processes using a database must be on the same host computer; WAL
does not work over a network filesystem." -
https://sqlite.org/wal.html.

- reepca


[-- Attachment #1.2: 0001-database-work-around-guile-sqlite3-bug-preventing-st.patch --]
[-- Type: text/x-patch, Size: 3478 bytes --]

From 614213c80a7ea15f7aab9502e6c33206ac089d05 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 18:50:07 -0500
Subject: [PATCH 1/5] database: work around guile-sqlite3 bug preventing
 statement reset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
keep re-preparing statements that are frequently used.  Unfortunately it
doesn't quite emulate the semantics of sqlite_finalize properly, because it
doesn't cause a commit if the statement being finalized is the last "active"
statement (see https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12).  We
work around this by wrapping sqlite-finalize with our own version that ensures
sqlite-reset is called, which does The Right Thing™.

* guix/store/database.scm (sqlite-finalize): new procedure that shadows the
  sqlite-finalize from (sqlite3).
---
 guix/store/database.scm | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/guix/store/database.scm b/guix/store/database.scm
index ef52036ede..15f5791a08 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -130,6 +130,38 @@ transaction after it finishes."
 If FILE doesn't exist, create it and initialize it as a new database."
   (call-with-database file (lambda (db) exp ...)))
 
+(define (sqlite-finalize stmt)
+  ;; As of guile-sqlite3 0.1.0, cached statements aren't reset when
+  ;; sqlite-finalize is invoked on them (see
+  ;; https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12).  This can
+  ;; cause problems with automatically-started transactions:
+  ;;
+  ;; "An implicit transaction (a transaction that is started automatically,
+  ;; not a transaction started by BEGIN) is committed automatically when the
+  ;; last active statement finishes.  A statement finishes when its last cursor
+  ;; closes, which is guaranteed to happen when the prepared statement is
+  ;; reset or finalized.  Some statements might "finish" for the purpose of
+  ;; transaction control prior to being reset or finalized, but there is no
+  ;; guarantee of this."
+  ;;
+  ;; Thus, it's possible for an implicitly-started transaction to hang around
+  ;; until sqlite-reset is called when the cached statement is next
+  ;; used.  Because the transaction is committed automatically only when the
+  ;; *last active statement* finishes, the implicitly-started transaction may
+  ;; later be upgraded to a write transaction (!) and this non-reset statement
+  ;; will still be keeping the transaction from committing until it is next
+  ;; used or the database connection is closed.  This has the potential to make
+  ;; (exclusive) write access to the database necessary for much longer than
+  ;; it should be.
+  ;;
+  ;; (see https://www.sqlite.org/lang_transaction.html)
+  ;; To work around this, we wrap sqlite-finalize so that sqlite-reset is
+  ;; always called.  This will continue working even when the behavior is fixed
+  ;; in guile-sqlite3, since resetting twice doesn't cause any problems.  We
+  ;; can remove this once the fixed guile-sqlite3 is widespread.
+  (sqlite-reset stmt)
+  ((@ (sqlite3) sqlite-finalize) stmt))
+
 (define (last-insert-row-id db)
   ;; XXX: (sqlite3) currently lacks bindings for 'sqlite3_last_insert_rowid'.
   ;; Work around that.
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.3: 0002-gnu-guile-sqlite3-add-patch-to-fix-sqlite-finalize-b.patch --]
[-- Type: text/x-patch, Size: 6630 bytes --]

From e3cf7be4491f465d3041933596d3caad1ea64e83 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Sun, 7 Jun 2020 22:30:41 -0500
Subject: [PATCH 2/5] gnu: guile-sqlite3: add patch to fix sqlite-finalize bug

Adds patch that fixes
https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12.  This can be
discarded once the patch is integrated into the next guile-sqlite3 release.
Note that the patch is identical to the pull request at
https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.

* gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch: new
  patch.
* gnu/packages/guile.scm (guile-sqlite3): use it.
* gnu/local.mk (dist_patch_DATA): add it.
---
 gnu/local.mk                                  |  1 +
 gnu/packages/guile.scm                        |  3 +-
 ...ile-sqlite3-reset-on-sqlite-finalize.patch | 74 +++++++++++++++++++
 3 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch

diff --git a/gnu/local.mk b/gnu/local.mk
index ae8a2275f7..d382205a03 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -1059,6 +1059,7 @@ dist_patch_DATA =						\
   %D%/packages/patches/guile-rsvg-pkgconfig.patch		\
   %D%/packages/patches/guile-emacs-fix-configure.patch		\
   %D%/packages/patches/guile-sqlite3-fix-cross-compilation.patch \
+  %D%/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch \
   %D%/packages/patches/gtk2-respect-GUIX_GTK2_PATH.patch	\
   %D%/packages/patches/gtk2-respect-GUIX_GTK2_IM_MODULE_FILE.patch \
   %D%/packages/patches/gtk2-theme-paths.patch			\
diff --git a/gnu/packages/guile.scm b/gnu/packages/guile.scm
index c2dc7f6d5f..63cd91a1cb 100644
--- a/gnu/packages/guile.scm
+++ b/gnu/packages/guile.scm
@@ -645,7 +645,8 @@ Guile's foreign function interface.")
                 "1nv8j7wk6b5n4p22szyi8lv8fs31rrzxhzz16gyj8r38c1fyp9qp"))
               (file-name (string-append name "-" version "-checkout"))
               (patches
-               (search-patches "guile-sqlite3-fix-cross-compilation.patch"))
+               (search-patches "guile-sqlite3-fix-cross-compilation.patch"
+                               "guile-sqlite3-reset-on-sqlite-finalize.patch"))
               (modules '((guix build utils)))
               (snippet
                '(begin
diff --git a/gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch b/gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch
new file mode 100644
index 0000000000..b6bf5325ad
--- /dev/null
+++ b/gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch
@@ -0,0 +1,74 @@
+From c59db66f9ac754b40463c6788ab9bad4f045cc92 Mon Sep 17 00:00:00 2001
+From: Caleb Ristvedt <caleb.ristvedt@cune.org>
+Date: Sun, 7 Jun 2020 18:27:44 -0500
+Subject: [PATCH] Reset statement when sqlite-finalize is called on cached
+ statement
+
+Automatically-started transactions only end when a statement finishes, which is
+normally either when sqlite-reset or sqlite-finalize is called for that
+statement.  Consequently, transactions automatically started by cached
+statements won't end until the statement is next reused by sqlite-prepare or
+sqlite-reset is called on it.  This changes sqlite-finalize so that it preserves
+the statement-finishing (and thus transaction-finishing) behavior of
+sqlite_finalize.
+---
+ sqlite3.scm.in | 43 ++++++++++++++++++++++++++++++++++---------
+ 1 file changed, 34 insertions(+), 9 deletions(-)
+
+diff --git a/sqlite3.scm.in b/sqlite3.scm.in
+index 77b5032..19241dc 100644
+--- a/sqlite3.scm.in
++++ b/sqlite3.scm.in
+@@ -274,15 +274,40 @@ statements, into DB.  The result is unspecified."
+             (dynamic-func "sqlite3_finalize" libsqlite3)
+             (list '*))))
+     (lambda (stmt)
+-      ;; Note: When STMT is cached, this is a no-op.  This ensures caching
+-      ;; actually works while still separating concerns: users can turn
+-      ;; caching on and off without having to change the rest of their code.
+-      (when (and (stmt-live? stmt)
+-                 (not (stmt-cached? stmt)))
+-        (let ((p (stmt-pointer stmt)))
+-          (sqlite-remove-statement! (stmt->db stmt) stmt)
+-          (set-stmt-live?! stmt #f)
+-          (f p))))))
++      ;; Note: When STMT is cached, this merely resets.  This ensures caching
++      ;; actually works while still separating concerns: users can turn caching
++      ;; on and off without having to change the rest of their code.
++      (if (and (stmt-live? stmt)
++               (not (stmt-cached? stmt)))
++          (let ((p (stmt-pointer stmt)))
++            (sqlite-remove-statement! (stmt->db stmt) stmt)
++            (set-stmt-live?! stmt #f)
++            (f p))
++          ;; It's necessary to reset cached statements due to the following:
++          ;;
++          ;; "An implicit transaction (a transaction that is started
++          ;; automatically, not a transaction started by BEGIN) is committed
++          ;; automatically when the last active statement finishes.  A statement
++          ;; finishes when its last cursor closes, which is guaranteed to happen
++          ;; when the prepared statement is reset or finalized.  Some statements
++          ;; might "finish" for the purpose of transaction control prior to
++          ;; being reset or finalized, but there is no guarantee of this."
++          ;;
++          ;; (see https://www.sqlite.org/lang_transaction.html)
++          ;;
++          ;; Thus, it's possible for an implicitly-started transaction to hang
++          ;; around until sqlite-reset is called when the cached statement is
++          ;; next used.  Because the transaction is committed automatically only
++          ;; when the *last active statement* finishes, the implicitly-started
++          ;; transaction may later be upgraded to a write transaction (!) and
++          ;; this non-reset statement will still be keeping the transaction from
++          ;; committing until it is next used or the database connection is
++          ;; closed.  This has the potential to make (exclusive) write access to
++          ;; the database necessary for much longer than it should be.
++          ;;
++          ;; So it's necessary to preserve the statement-finishing behavior of
++          ;; sqlite_finalize here, which we do by calling sqlite-reset.
++          (sqlite-reset stmt)))))
+ 
+ (define *stmt-map* (make-weak-key-hash-table))
+ (define (stmt->db stmt)
+-- 
+2.26.2
+
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.4: 0003-database-rewrite-query-procedures-in-terms-of-with-s.patch --]
[-- Type: text/x-patch, Size: 5878 bytes --]

From 1d0b2a6742935b812fe56ae97e34f4a35fed3348 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 19:21:43 -0500
Subject: [PATCH 3/5] database: rewrite query procedures in terms of
 with-statement.

Most of our queries would fail to finalize their statements properly if sqlite
returned an error during their execution.  This resolves that, and also makes
them somewhat more concise as a side-effect.

This also makes some small changes to improve certain queries where behavior
was strange or overly verbose.

* guix/store/database.scm (call-with-statement): new procedure.
  (with-statement): new macro.
  (last-insert-row-id, path-id, update-or-insert, add-references): rewrite to
  use with-statement.
  (update-or-insert): factor last-insert-row-id out of the end of both
  branches.
  (add-references): remove pointless last-insert-row-id call.

* .dir-locals.el (with-statement): add indenting information.
---
 .dir-locals.el          |  1 +
 guix/store/database.scm | 53 ++++++++++++++++++++++-------------------
 2 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index dc8bc0e437..77c12f9411 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -89,6 +89,7 @@
 
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
+   (eval . (put 'with-statement 'scheme-indent-function 3))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 15f5791a08..2a943a7eb0 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -162,14 +162,26 @@ If FILE doesn't exist, create it and initialize it as a new database."
   (sqlite-reset stmt)
   ((@ (sqlite3) sqlite-finalize) stmt))
 
+(define (call-with-statement db sql proc)
+  (let ((stmt (sqlite-prepare db sql #:cache? #t)))
+    (dynamic-wind
+      (const #t)
+      (lambda ()
+        (proc stmt))
+      (lambda ()
+        (sqlite-finalize stmt)))))
+
+(define-syntax-rule (with-statement db sql stmt exp ...)
+  "Run EXP... with STMT bound to a prepared statement corresponding to the sql
+string SQL for DB."
+  (call-with-statement db sql
+                       (lambda (stmt) exp ...)))
+
 (define (last-insert-row-id db)
   ;; XXX: (sqlite3) currently lacks bindings for 'sqlite3_last_insert_rowid'.
   ;; Work around that.
-  (let* ((stmt   (sqlite-prepare db "SELECT last_insert_rowid();"
-                                 #:cache? #t))
-         (result (sqlite-fold cons '() stmt)))
-    (sqlite-finalize stmt)
-    (match result
+  (with-statement db "SELECT last_insert_rowid();" stmt
+    (match (sqlite-fold cons '() stmt)
       ((#(id)) id)
       (_ #f))))
 
@@ -179,13 +191,11 @@ If FILE doesn't exist, create it and initialize it as a new database."
 (define* (path-id db path)
   "If PATH exists in the 'ValidPaths' table, return its numerical
 identifier.  Otherwise, return #f."
-  (let ((stmt (sqlite-prepare db path-id-sql #:cache? #t)))
+  (with-statement db path-id-sql stmt
     (sqlite-bind-arguments stmt #:path path)
-    (let ((result (sqlite-fold cons '() stmt)))
-      (sqlite-finalize stmt)
-      (match result
-        ((#(id) . _) id)
-        (_ #f)))))
+    (match (sqlite-fold cons '() stmt)
+      ((#(id) . _) id)
+      (_ #f))))
 
 (define update-sql
   "UPDATE ValidPaths SET hash = :hash, registrationTime = :time, deriver =
@@ -202,20 +212,17 @@ and re-inserting instead of updating, which causes problems with foreign keys,
 of course. Returns the row id of the row that was modified or inserted."
   (let ((id (path-id db path)))
     (if id
-        (let ((stmt (sqlite-prepare db update-sql #:cache? #t)))
+        (with-statement db update-sql stmt
           (sqlite-bind-arguments stmt #:id id
                                  #:deriver deriver
                                  #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt)
-          (sqlite-finalize stmt)
-          (last-insert-row-id db))
-        (let ((stmt (sqlite-prepare db insert-sql #:cache? #t)))
+          (sqlite-fold cons '() stmt))
+        (with-statement db insert-sql stmt
           (sqlite-bind-arguments stmt
                                  #:path path #:deriver deriver
                                  #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt)             ;execute it
-          (sqlite-finalize stmt)
-          (last-insert-row-id db)))))
+          (sqlite-fold cons '() stmt)))
+    (last-insert-row-id db)))
 
 (define add-reference-sql
   "INSERT OR REPLACE INTO Refs (referrer, reference) VALUES (:referrer, :reference);")
@@ -223,15 +230,13 @@ of course. Returns the row id of the row that was modified or inserted."
 (define (add-references db referrer references)
   "REFERRER is the id of the referring store item, REFERENCES is a list
 ids of items referred to."
-  (let ((stmt (sqlite-prepare db add-reference-sql #:cache? #t)))
+  (with-statement db add-reference-sql stmt
     (for-each (lambda (reference)
                 (sqlite-reset stmt)
                 (sqlite-bind-arguments stmt #:referrer referrer
                                        #:reference reference)
-                (sqlite-fold cons '() stmt)       ;execute it
-                (last-insert-row-id db))
-              references)
-    (sqlite-finalize stmt)))
+                (sqlite-fold cons '() stmt))
+              references)))
 
 (define* (sqlite-register db #:key path (references '())
                           deriver hash nar-size time)
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.5: 0004-database-ensure-update-or-insert-is-run-within-a-tra.patch --]
[-- Type: text/x-patch, Size: 5730 bytes --]

From 00b220d1c380004a9bb128b73bb8027f2ff156f0 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 21:43:14 -0500
Subject: [PATCH 4/5] database: ensure update-or-insert is run within a
 transaction

update-or-insert can break if an insert occurs between when it decides whether
to update or insert and when it actually performs that operation.  Putting the
check and the update/insert operation in the same transaction ensures that the
update/insert will only succeed if no other write has occurred in the middle.

* guix/store/database.scm (call-with-savepoint): new procedure.
  (update-or-insert): use call-with-savepoint to ensure the read and the
  insert/update occur within the same transaction.
---
 .dir-locals.el          |  1 +
 guix/store/database.scm | 68 +++++++++++++++++++++++++++++++++--------
 2 files changed, 56 insertions(+), 13 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index 77c12f9411..d9c81b2a48 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -90,6 +90,7 @@
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
    (eval . (put 'with-statement 'scheme-indent-function 3))
+   (eval . (put 'call-with-savepoint 'scheme-indent-function 1))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 2a943a7eb0..14aaeef176 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -120,6 +120,26 @@ transaction after it finishes."
           (begin
             (sqlite-exec db "rollback;")
             (throw 'sqlite-error who error description))))))
+(define* (call-with-savepoint db proc
+                              #:optional (savepoint-name "SomeSavepoint"))
+  "Call PROC after creating a savepoint named SAVEPOINT-NAME.  If PROC exits
+abnormally, rollback to that savepoint.  In all cases, remove the savepoint
+prior to returning."
+  (define (exec sql)
+    (with-statement db sql stmt
+      (sqlite-fold cons '() stmt)))
+
+  (dynamic-wind
+    (lambda ()
+      (exec (string-append "SAVEPOINT " savepoint-name ";")))
+    (lambda ()
+      (catch #t
+        proc
+        (lambda args
+          (exec (string-append "ROLLBACK TO " savepoint-name ";"))
+          (apply throw args))))
+    (lambda ()
+      (exec (string-append "RELEASE " savepoint-name ";")))))
 
 (define %default-database-file
   ;; Default location of the store database.
@@ -210,19 +230,41 @@ VALUES (:path, :hash, :time, :deriver, :size)")
 doesn't exactly have... they've got something close, but it involves deleting
 and re-inserting instead of updating, which causes problems with foreign keys,
 of course. Returns the row id of the row that was modified or inserted."
-  (let ((id (path-id db path)))
-    (if id
-        (with-statement db update-sql stmt
-          (sqlite-bind-arguments stmt #:id id
-                                 #:deriver deriver
-                                 #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt))
-        (with-statement db insert-sql stmt
-          (sqlite-bind-arguments stmt
-                                 #:path path #:deriver deriver
-                                 #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt)))
-    (last-insert-row-id db)))
+
+  ;; It's important that querying the path-id and the insert/update operation
+  ;; take place in the same transaction, as otherwise some other
+  ;; process/thread/fiber could register the same path between when we check
+  ;; whether it's already registered and when we register it, resulting in
+  ;; duplicate paths (which, due to a 'unique' constraint, would cause an
+  ;; exception to be thrown). With the default journaling mode this will
+  ;; prevent writes from occurring during that sensitive time, but with WAL
+  ;; mode it will instead arrange to return SQLITE_BUSY when a write occurs
+  ;; between the start of a read transaction and its upgrading to a write
+  ;; transaction (see https://sqlite.org/rescode.html#busy_snapshot).
+  ;; Experimentally, it seems this SQLITE_BUSY will ignore a busy_timeout and
+  ;; immediately return (makes sense, since waiting won't change anything).
+
+  ;; Note that when that kind of SQLITE_BUSY error is returned, it will keep
+  ;; being returned every time we try to upgrade the same outermost
+  ;; transaction to a write transaction.  So when retrying, we have to restart
+  ;; the *outermost* write transaction.  We can't inherently tell whether
+  ;; we're the outermost write transaction, so we leave the retry-handling to
+  ;; the caller.
+  (call-with-savepoint db
+    (lambda ()
+      (let ((id (path-id db path)))
+        (if id
+            (with-statement db update-sql stmt
+              (sqlite-bind-arguments stmt #:id id
+                                     #:deriver deriver
+                                     #:hash hash #:size nar-size #:time time)
+              (sqlite-fold cons '() stmt))
+            (with-statement db insert-sql stmt
+              (sqlite-bind-arguments stmt
+                                     #:path path #:deriver deriver
+                                     #:hash hash #:size nar-size #:time time)
+              (sqlite-fold cons '() stmt)))
+        (last-insert-row-id db)))))
 
 (define add-reference-sql
   "INSERT OR REPLACE INTO Refs (referrer, reference) VALUES (:referrer, :reference);")
-- 
2.26.2


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.6: 0005-database-separate-transaction-handling-and-retry-han.patch --]
[-- Type: text/x-patch, Size: 6216 bytes --]

From af87a556dab110ed7a6ef2fa1584778cc60be682 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 22:15:21 -0500
Subject: [PATCH 5/5] database: separate transaction-handling and
 retry-handling.

Previously call-with-transaction would both retry when SQLITE_BUSY errors were
thrown and do what its name suggested (start and rollback/commit a
transaction).  This changes it to do only what its name implies, which
simplifies its implementation.  Retrying is provided by the new
call-with-SQLITE_BUSY-retrying procedure.

* guix/store/database.scm (call-with-transaction): no longer restarts, new
  #:restartable? argument controls whether "begin" or "begin immediate" is
  used.
  (call-with-SQLITE_BUSY-retrying, call-with-retrying-transaction,
  call-with-retrying-savepoint): new procedures.
  (register-items): use call-with-retrying-transaction to preserve old
  behavior.

* .dir-locals.el (call-with-retrying-transaction,
  call-with-retrying-savepoint): add indentation information.
---
 .dir-locals.el          |  2 ++
 guix/store/database.scm | 69 +++++++++++++++++++++++++++++------------
 2 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index d9c81b2a48..b88ec7a795 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -90,7 +90,9 @@
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
    (eval . (put 'with-statement 'scheme-indent-function 3))
+   (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 2))
    (eval . (put 'call-with-savepoint 'scheme-indent-function 1))
+   (eval . (put 'call-with-retrying-savepoint 'scheme-indent-function 1))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 14aaeef176..4921e9f0e2 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -99,27 +99,44 @@ create it and initialize it as a new database."
 ;; XXX: missing in guile-sqlite3@0.1.0
 (define SQLITE_BUSY 5)
 
-(define (call-with-transaction db proc)
-  "Start a transaction with DB (make as many attempts as necessary) and run
-PROC.  If PROC exits abnormally, abort the transaction, otherwise commit the
-transaction after it finishes."
+(define (call-with-SQLITE_BUSY-retrying thunk)
+  "Call THUNK, retrying as long as it exits abnormally due to SQLITE_BUSY
+errors."
   (catch 'sqlite-error
+    thunk
+    (lambda (key who code errmsg)
+      (if (= code SQLITE_BUSY)
+          (call-with-SQLITE_BUSY-retrying thunk)
+          (throw key who code errmsg)))))
+
+
+
+(define* (call-with-transaction db proc #:key restartable?)
+  "Start a transaction with DB and run PROC.  If PROC exits abnormally, abort
+the transaction, otherwise commit the transaction after it finishes.
+RESTARTABLE? may be set to a non-#f value when it is safe to run PROC multiple
+times.  This may reduce contention for the database somewhat."
+  (define (exec sql)
+    (with-statement db sql stmt
+      (sqlite-fold cons '() stmt)))
+  ;; We might use begin immediate here so that if we need to retry, we figure
+  ;; that out immediately rather than because some SQLITE_BUSY exception gets
+  ;; thrown partway through PROC - in which case the part already executed
+  ;; (which may contain side-effects!) might have to be executed again for
+  ;; every retry.
+  (exec (if restartable? "begin;" "begin immediate;"))
+  (catch #t
     (lambda ()
-      ;; We use begin immediate here so that if we need to retry, we
-      ;; figure that out immediately rather than because some SQLITE_BUSY
-      ;; exception gets thrown partway through PROC - in which case the
-      ;; part already executed (which may contain side-effects!) would be
-      ;; executed again for every retry.
-      (sqlite-exec db "begin immediate;")
-      (let ((result (proc)))
-        (sqlite-exec db "commit;")
-        result))
-    (lambda (key who error description)
-      (if (= error SQLITE_BUSY)
-          (call-with-transaction db proc)
-          (begin
-            (sqlite-exec db "rollback;")
-            (throw 'sqlite-error who error description))))))
+      (let-values ((result (proc)))
+        (exec "commit;")
+        (apply values result)))
+    (lambda args
+      ;; The roll back may or may not have occurred automatically when the
+      ;; error was generated. If it has occurred, this does nothing but signal
+      ;; an error. If it hasn't occurred, this needs to be done.
+      (false-if-exception (exec "rollback;"))
+      (apply throw args))))
+
 (define* (call-with-savepoint db proc
                               #:optional (savepoint-name "SomeSavepoint"))
   "Call PROC after creating a savepoint named SAVEPOINT-NAME.  If PROC exits
@@ -141,6 +158,18 @@ prior to returning."
     (lambda ()
       (exec (string-append "RELEASE " savepoint-name ";")))))
 
+(define* (call-with-retrying-transaction db proc #:key restartable?)
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-transaction db proc #:restartable? restartable?))))
+
+(define* (call-with-retrying-savepoint db proc
+                                       #:optional (savepoint-name
+                                                   "SomeSavepoint"))
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-savepoint db proc savepoint-name))))
+
 (define %default-database-file
   ;; Default location of the store database.
   (string-append %store-database-directory "/db.sqlite"))
@@ -433,7 +462,7 @@ Write a progress report to LOG-PORT."
   (mkdir-p db-dir)
   (parameterize ((sql-schema schema))
     (with-database (string-append db-dir "/db.sqlite") db
-      (call-with-transaction db
+      (call-with-retrying-transaction db
           (lambda ()
             (let* ((prefix   (format #f "registering ~a items" (length items)))
                    (progress (progress-reporter/bar (length items)
-- 
2.26.2


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [bug#41658] [PATCH] fixes / improvements for (guix store database)
  2020-06-08  5:52   ` Caleb Ristvedt
@ 2020-06-09  8:42     ` Ludovic Courtès
  0 siblings, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2020-06-09  8:42 UTC (permalink / raw)
  To: Caleb Ristvedt; +Cc: Danny Milosavljevic, 41658

Hi,

Caleb Ristvedt <caleb.ristvedt@cune.org> skribis:


[...]

>> Nice.  It would be great if you could report it upstream (Danny and/or
>> myself can then patch it directly in guile-sqlite3 and push out a
>> release) and refer to the issue from here.
>
> Reported as https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12,
> with corresponding pull request
> https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.

Awesome, thank you.

>> We can have this patch locally in the meantime, unless it would break
>> once the new guile-sqlite3 is out.  WDYT?
>
> I've attached an updated patch series that both includes the
> guile-sqlite3 fix as a patch to the guile-sqlite3 package and adopts the
> workaround for situations where older guile-sqlite3's must be used (for
> example, when building guix from scratch on foreign distros that haven't
> incorporated the fix yet). The only changes are the addition of the
> now-second patch and fixing up some spacing in the comment in the first
> patch.

OK.

>>> * guix/store/database.scm (call-with-savepoint): new procedure.
>>>   (update-or-insert): use call-with-savepoint to ensure the read and the
>>>   insert/update occur within the same transaction.
>>
>> That’s a bit beyond my understanding, but I think you can also push this
>> one.  :-)
>
> Basically, it's like combining the body of two separate compare-and-swap
> loops into a single compare-and-swap loop. This ensures that the view is
> consistent (since if it isn't, the "compare" will fail and we'll
> retry). It addresses a problem that doesn't exist in practice yet, since
> update-or-insert is only called from within a call-with-transaction
> currently. But if someone ever wanted to call it from outside of a
> call-with-transaction, this would ensure that it still worked correctly.

Makes sense, thanks for explaining.

> Related question: does berlin export /var/guix over NFS as per
> http://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster? If so,
> that could interact poorly with our use of WAL mode:

No, it doesn’t.  (Also, in the setup described above, there’s only one
guix-daemon instance and it accesses the database via the local file
system.)

> From 614213c80a7ea15f7aab9502e6c33206ac089d05 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 18:50:07 -0500
> Subject: [PATCH 1/5] database: work around guile-sqlite3 bug preventing
>  statement reset
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
> keep re-preparing statements that are frequently used.  Unfortunately it
> doesn't quite emulate the semantics of sqlite_finalize properly, because it
> doesn't cause a commit if the statement being finalized is the last "active"
> statement (see https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12).  We
> work around this by wrapping sqlite-finalize with our own version that ensures
> sqlite-reset is called, which does The Right Thing™.
>
> * guix/store/database.scm (sqlite-finalize): new procedure that shadows the
>   sqlite-finalize from (sqlite3).

[...]

> +(define (sqlite-finalize stmt)
> +  ;; As of guile-sqlite3 0.1.0, cached statements aren't reset when
> +  ;; sqlite-finalize is invoked on them (see
> +  ;; https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12).  This can
> +  ;; cause problems with automatically-started transactions:

I think it’s enough to link to the upstream issue, which has the problem
well documented.

> From e3cf7be4491f465d3041933596d3caad1ea64e83 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Sun, 7 Jun 2020 22:30:41 -0500
> Subject: [PATCH 2/5] gnu: guile-sqlite3: add patch to fix sqlite-finalize bug
>
> Adds patch that fixes
> https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12.  This can be
> discarded once the patch is integrated into the next guile-sqlite3 release.
> Note that the patch is identical to the pull request at
> https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.
>
> * gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch: new
>   patch.
> * gnu/packages/guile.scm (guile-sqlite3): use it.
> * gnu/local.mk (dist_patch_DATA): add it.

I’d skip it: we have a workaround and the release may be out soon.

Danny, thoughts on getting a new release out?

The rest is still fine with me, thank you!

Ludo’.




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-06-09  8:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-02  6:31 [bug#41658] [PATCH] fixes / improvements for (guix store database) Caleb Ristvedt
2020-06-04 16:40 ` Ludovic Courtès
2020-06-04 17:00   ` Danny Milosavljevic
2020-06-05 16:19     ` Ludovic Courtès
2020-06-08  5:52   ` Caleb Ristvedt
2020-06-09  8:42     ` Ludovic Courtès

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).