* [bug#41658] [PATCH] fixes / improvements for (guix store database) @ 2020-06-02 6:31 Caleb Ristvedt 2020-06-04 16:40 ` Ludovic Courtès 0 siblings, 1 reply; 6+ messages in thread From: Caleb Ristvedt @ 2020-06-02 6:31 UTC (permalink / raw) To: 41658 [-- Attachment #1.1: Type: text/plain, Size: 471 bytes --] After some pondering about why the database might be locked so frequently, this is what I've managed to come up with. The first patch is the most likely to actually help with that, and the others mostly involve improving robustness. Ideally we'd come up with a test to quantify how much these kinds of changes affect contention over the database. For now, though, all that I can think of is seeing how this affects the systems that have had issues with that. - reepca [-- Attachment #1.2: 0001-database-work-around-guile-sqlite3-bug-preventing-st.patch --] [-- Type: text/x-patch, Size: 3299 bytes --] From cce653c590be1506e15044e445aa9805370ac759 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <caleb.ristvedt@cune.org> Date: Mon, 1 Jun 2020 18:50:07 -0500 Subject: [PATCH 1/4] database: work around guile-sqlite3 bug preventing statement reset MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit guile-sqlite3 provides statement caching, making it unnecessary for sqlite to keep re-preparing statements that are frequently used. Unfortunately it doesn't quite emulate the semantics of sqlite_finalize properly, because it doesn't cause a commit if the statement being finalized is the last "active" statement. We work around this by wrapping sqlite-finalize with our own version that ensures sqlite-reset is called, which does The Right Thing™. * guix/store/database.scm (sqlite-finalize): new procedure that shadows the sqlite-finalize from (sqlite3). 
--- guix/store/database.scm | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/guix/store/database.scm b/guix/store/database.scm index ef52036ede..d4251e580e 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -130,6 +130,36 @@ transaction after it finishes." If FILE doesn't exist, create it and initialize it as a new database." (call-with-database file (lambda (db) exp ...))) +(define (sqlite-finalize stmt) + ;; Cached statements aren't reset when sqlite-finalize is invoked on + ;; them. This can cause problems with automatically-started transactions: + ;; + ;; "An implicit transaction (a transaction that is started automatically, + ;; not a transaction started by BEGIN) is committed automatically when the + ;; last active statement finishes. A statement finishes when its last cursor + ;; closes, which is guaranteed to happen when the prepared statement is + ;; reset or finalized. Some statements might "finish" for the purpose of + ;; transaction control prior to being reset or finalized, but there is no + ;; guarantee of this." + ;; + ;; Thus, it's possible for an implicitly-started transaction to hang around + ;; until sqlite-reset is called when the cached statement is next + ;; used. Because the transaction is committed automatically only when the + ;; *last active statement* finishes, the implicitly-started transaction may + ;; later be upgraded to a write transaction (!) and this non-reset statement + ;; will still be keeping the transaction from committing until it is next + ;; used or the database connection is closed. This has the potential to make + ;; (exclusive) write access to the database necessary for much longer than + ;; it should be. + ;; + ;; (see https://www.sqlite.org/lang_transaction.html) + ;; To work around this, we wrap sqlite-finalize so that sqlite-reset is + ;; always called. 
This will continue working even when the behavior is fixed + ;; in guile-sqlite3, since resetting twice doesn't cause any problems. We + ;; can remove this once the fixed guile-sqlite3 is widespread. + (sqlite-reset stmt) + ((@ (sqlite3) sqlite-finalize) stmt)) + (define (last-insert-row-id db) ;; XXX: (sqlite3) currently lacks bindings for 'sqlite3_last_insert_rowid'. ;; Work around that. -- 2.26.2 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1.3: 0002-database-rewrite-query-procedures-in-terms-of-with-s.patch --] [-- Type: text/x-patch, Size: 5878 bytes --] From ee24ab21122b1c75a7d67d7062550e15e54ab62f Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <caleb.ristvedt@cune.org> Date: Mon, 1 Jun 2020 19:21:43 -0500 Subject: [PATCH 2/4] database: rewrite query procedures in terms of with-statement. Most of our queries would fail to finalize their statements properly if sqlite returned an error during their execution. This resolves that, and also makes them somewhat more concise as a side-effect. This also makes some small changes to improve certain queries where behavior was strange or overly verbose. * guix/store/database.scm (call-with-statement): new procedure. (with-statement): new macro. (last-insert-row-id, path-id, update-or-insert, add-references): rewrite to use with-statement. (update-or-insert): factor last-insert-row-id out of the end of both branches. (add-references): remove pointless last-insert-row-id call. * .dir-locals.el (with-statement): add indenting information. --- .dir-locals.el | 1 + guix/store/database.scm | 53 ++++++++++++++++++++++------------------- 2 files changed, 30 insertions(+), 24 deletions(-) diff --git a/.dir-locals.el b/.dir-locals.el index fcde914e60..a085269e85 100644 --- a/.dir-locals.el +++ b/.dir-locals.el @@ -89,6 +89,7 @@ (eval . (put 'with-database 'scheme-indent-function 2)) (eval . (put 'call-with-transaction 'scheme-indent-function 2)) + (eval . 
(put 'with-statement 'scheme-indent-function 3)) (eval . (put 'call-with-container 'scheme-indent-function 1)) (eval . (put 'container-excursion 'scheme-indent-function 1)) diff --git a/guix/store/database.scm b/guix/store/database.scm index d4251e580e..2209da3df1 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -160,14 +160,26 @@ If FILE doesn't exist, create it and initialize it as a new database." (sqlite-reset stmt) ((@ (sqlite3) sqlite-finalize) stmt)) +(define (call-with-statement db sql proc) + (let ((stmt (sqlite-prepare db sql #:cache? #t))) + (dynamic-wind + (const #t) + (lambda () + (proc stmt)) + (lambda () + (sqlite-finalize stmt))))) + +(define-syntax-rule (with-statement db sql stmt exp ...) + "Run EXP... with STMT bound to a prepared statement corresponding to the sql +string SQL for DB." + (call-with-statement db sql + (lambda (stmt) exp ...))) + (define (last-insert-row-id db) ;; XXX: (sqlite3) currently lacks bindings for 'sqlite3_last_insert_rowid'. ;; Work around that. - (let* ((stmt (sqlite-prepare db "SELECT last_insert_rowid();" - #:cache? #t)) - (result (sqlite-fold cons '() stmt))) - (sqlite-finalize stmt) - (match result + (with-statement db "SELECT last_insert_rowid();" stmt + (match (sqlite-fold cons '() stmt) ((#(id)) id) (_ #f)))) @@ -177,13 +189,11 @@ If FILE doesn't exist, create it and initialize it as a new database." (define* (path-id db path) "If PATH exists in the 'ValidPaths' table, return its numerical identifier. Otherwise, return #f." - (let ((stmt (sqlite-prepare db path-id-sql #:cache? #t))) + (with-statement db path-id-sql stmt (sqlite-bind-arguments stmt #:path path) - (let ((result (sqlite-fold cons '() stmt))) - (sqlite-finalize stmt) - (match result - ((#(id) . _) id) - (_ #f))))) + (match (sqlite-fold cons '() stmt) + ((#(id) . 
_) id) + (_ #f)))) (define update-sql "UPDATE ValidPaths SET hash = :hash, registrationTime = :time, deriver = @@ -200,20 +210,17 @@ and re-inserting instead of updating, which causes problems with foreign keys, of course. Returns the row id of the row that was modified or inserted." (let ((id (path-id db path))) (if id - (let ((stmt (sqlite-prepare db update-sql #:cache? #t))) + (with-statement db update-sql stmt (sqlite-bind-arguments stmt #:id id #:deriver deriver #:hash hash #:size nar-size #:time time) - (sqlite-fold cons '() stmt) - (sqlite-finalize stmt) - (last-insert-row-id db)) - (let ((stmt (sqlite-prepare db insert-sql #:cache? #t))) + (sqlite-fold cons '() stmt)) + (with-statement db insert-sql stmt (sqlite-bind-arguments stmt #:path path #:deriver deriver #:hash hash #:size nar-size #:time time) - (sqlite-fold cons '() stmt) ;execute it - (sqlite-finalize stmt) - (last-insert-row-id db))))) + (sqlite-fold cons '() stmt))) + (last-insert-row-id db))) (define add-reference-sql "INSERT OR REPLACE INTO Refs (referrer, reference) VALUES (:referrer, :reference);") @@ -221,15 +228,13 @@ of course. Returns the row id of the row that was modified or inserted." (define (add-references db referrer references) "REFERRER is the id of the referring store item, REFERENCES is a list ids of items referred to." - (let ((stmt (sqlite-prepare db add-reference-sql #:cache? 
#t))) + (with-statement db add-reference-sql stmt (for-each (lambda (reference) (sqlite-reset stmt) (sqlite-bind-arguments stmt #:referrer referrer #:reference reference) - (sqlite-fold cons '() stmt) ;execute it - (last-insert-row-id db)) - references) - (sqlite-finalize stmt))) + (sqlite-fold cons '() stmt)) + references))) (define* (sqlite-register db #:key path (references '()) deriver hash nar-size time) -- 2.26.2 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1.4: 0003-database-ensure-update-or-insert-is-run-within-a-tra.patch --] [-- Type: text/x-patch, Size: 5730 bytes --] From 7d34c27c33aed3e8a49b9796a62a8c19d352e653 Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <caleb.ristvedt@cune.org> Date: Mon, 1 Jun 2020 21:43:14 -0500 Subject: [PATCH 3/4] database: ensure update-or-insert is run within a transaction update-or-insert can break if an insert occurs between when it decides whether to update or insert and when it actually performs that operation. Putting the check and the update/insert operation in the same transaction ensures that the update/insert will only succeed if no other write has occurred in the middle. * guix/store/database.scm (call-with-savepoint): new procedure. (update-or-insert): use call-with-savepoint to ensure the read and the insert/update occur within the same transaction. --- .dir-locals.el | 1 + guix/store/database.scm | 68 +++++++++++++++++++++++++++++++++-------- 2 files changed, 56 insertions(+), 13 deletions(-) diff --git a/.dir-locals.el b/.dir-locals.el index a085269e85..ef25cb100a 100644 --- a/.dir-locals.el +++ b/.dir-locals.el @@ -90,6 +90,7 @@ (eval . (put 'with-database 'scheme-indent-function 2)) (eval . (put 'call-with-transaction 'scheme-indent-function 2)) (eval . (put 'with-statement 'scheme-indent-function 3)) + (eval . (put 'call-with-savepoint 'scheme-indent-function 1)) (eval . (put 'call-with-container 'scheme-indent-function 1)) (eval . 
(put 'container-excursion 'scheme-indent-function 1)) diff --git a/guix/store/database.scm b/guix/store/database.scm index 2209da3df1..3955c48b1f 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -120,6 +120,26 @@ transaction after it finishes." (begin (sqlite-exec db "rollback;") (throw 'sqlite-error who error description)))))) +(define* (call-with-savepoint db proc + #:optional (savepoint-name "SomeSavepoint")) + "Call PROC after creating a savepoint named SAVEPOINT-NAME. If PROC exits +abnormally, rollback to that savepoint. In all cases, remove the savepoint +prior to returning." + (define (exec sql) + (with-statement db sql stmt + (sqlite-fold cons '() stmt))) + + (dynamic-wind + (lambda () + (exec (string-append "SAVEPOINT " savepoint-name ";"))) + (lambda () + (catch #t + proc + (lambda args + (exec (string-append "ROLLBACK TO " savepoint-name ";")) + (apply throw args)))) + (lambda () + (exec (string-append "RELEASE " savepoint-name ";"))))) (define %default-database-file ;; Default location of the store database. @@ -208,19 +228,41 @@ VALUES (:path, :hash, :time, :deriver, :size)") doesn't exactly have... they've got something close, but it involves deleting and re-inserting instead of updating, which causes problems with foreign keys, of course. Returns the row id of the row that was modified or inserted." 
- (let ((id (path-id db path))) - (if id - (with-statement db update-sql stmt - (sqlite-bind-arguments stmt #:id id - #:deriver deriver - #:hash hash #:size nar-size #:time time) - (sqlite-fold cons '() stmt)) - (with-statement db insert-sql stmt - (sqlite-bind-arguments stmt - #:path path #:deriver deriver - #:hash hash #:size nar-size #:time time) - (sqlite-fold cons '() stmt))) - (last-insert-row-id db))) + + ;; It's important that querying the path-id and the insert/update operation + ;; take place in the same transaction, as otherwise some other + ;; process/thread/fiber could register the same path between when we check + ;; whether it's already registered and when we register it, resulting in + ;; duplicate paths (which, due to a 'unique' constraint, would cause an + ;; exception to be thrown). With the default journaling mode this will + ;; prevent writes from occurring during that sensitive time, but with WAL + ;; mode it will instead arrange to return SQLITE_BUSY when a write occurs + ;; between the start of a read transaction and its upgrading to a write + ;; transaction (see https://sqlite.org/rescode.html#busy_snapshot). + ;; Experimentally, it seems this SQLITE_BUSY will ignore a busy_timeout and + ;; immediately return (makes sense, since waiting won't change anything). + + ;; Note that when that kind of SQLITE_BUSY error is returned, it will keep + ;; being returned every time we try to upgrade the same outermost + ;; transaction to a write transaction. So when retrying, we have to restart + ;; the *outermost* write transaction. We can't inherently tell whether + ;; we're the outermost write transaction, so we leave the retry-handling to + ;; the caller. 
+ (call-with-savepoint db + (lambda () + (let ((id (path-id db path))) + (if id + (with-statement db update-sql stmt + (sqlite-bind-arguments stmt #:id id + #:deriver deriver + #:hash hash #:size nar-size #:time time) + (sqlite-fold cons '() stmt)) + (with-statement db insert-sql stmt + (sqlite-bind-arguments stmt + #:path path #:deriver deriver + #:hash hash #:size nar-size #:time time) + (sqlite-fold cons '() stmt))) + (last-insert-row-id db))))) (define add-reference-sql "INSERT OR REPLACE INTO Refs (referrer, reference) VALUES (:referrer, :reference);") -- 2.26.2 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1.5: 0004-database-separate-transaction-handling-and-retry-han.patch --] [-- Type: text/x-patch, Size: 6216 bytes --] From e30271728dfb23324c981d226c752b17689c9eef Mon Sep 17 00:00:00 2001 From: Caleb Ristvedt <caleb.ristvedt@cune.org> Date: Mon, 1 Jun 2020 22:15:21 -0500 Subject: [PATCH 4/4] database: separate transaction-handling and retry-handling. Previously call-with-transaction would both retry when SQLITE_BUSY errors were thrown and do what its name suggested (start and rollback/commit a transaction). This changes it to do only what its name implies, which simplifies its implementation. Retrying is provided by the new call-with-SQLITE_BUSY-retrying procedure. * guix/store/database.scm (call-with-transaction): no longer restarts, new #:restartable? argument controls whether "begin" or "begin immediate" is used. (call-with-SQLITE_BUSY-retrying, call-with-retrying-transaction, call-with-retrying-savepoint): new procedures. (register-items): use call-with-retrying-transaction to preserve old behavior. * .dir-locals.el (call-with-retrying-transaction, call-with-retrying-savepoint): add indentation information. 
--- .dir-locals.el | 2 ++ guix/store/database.scm | 69 +++++++++++++++++++++++++++++------------ 2 files changed, 51 insertions(+), 20 deletions(-) diff --git a/.dir-locals.el b/.dir-locals.el index ef25cb100a..e9dccd0511 100644 --- a/.dir-locals.el +++ b/.dir-locals.el @@ -90,7 +90,9 @@ (eval . (put 'with-database 'scheme-indent-function 2)) (eval . (put 'call-with-transaction 'scheme-indent-function 2)) (eval . (put 'with-statement 'scheme-indent-function 3)) + (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 2)) (eval . (put 'call-with-savepoint 'scheme-indent-function 1)) + (eval . (put 'call-with-retrying-savepoint 'scheme-indent-function 1)) (eval . (put 'call-with-container 'scheme-indent-function 1)) (eval . (put 'container-excursion 'scheme-indent-function 1)) diff --git a/guix/store/database.scm b/guix/store/database.scm index 3955c48b1f..2a78379dac 100644 --- a/guix/store/database.scm +++ b/guix/store/database.scm @@ -99,27 +99,44 @@ create it and initialize it as a new database." ;; XXX: missing in guile-sqlite3@0.1.0 (define SQLITE_BUSY 5) -(define (call-with-transaction db proc) - "Start a transaction with DB (make as many attempts as necessary) and run -PROC. If PROC exits abnormally, abort the transaction, otherwise commit the -transaction after it finishes." +(define (call-with-SQLITE_BUSY-retrying thunk) + "Call THUNK, retrying as long as it exits abnormally due to SQLITE_BUSY +errors." (catch 'sqlite-error + thunk + (lambda (key who code errmsg) + (if (= code SQLITE_BUSY) + (call-with-SQLITE_BUSY-retrying thunk) + (throw key who code errmsg))))) + + + +(define* (call-with-transaction db proc #:key restartable?) + "Start a transaction with DB and run PROC. If PROC exits abnormally, abort +the transaction, otherwise commit the transaction after it finishes. +RESTARTABLE? may be set to a non-#f value when it is safe to run PROC multiple +times. This may reduce contention for the database somewhat." 
+ (define (exec sql) + (with-statement db sql stmt + (sqlite-fold cons '() stmt))) + ;; We might use begin immediate here so that if we need to retry, we figure + ;; that out immediately rather than because some SQLITE_BUSY exception gets + ;; thrown partway through PROC - in which case the part already executed + ;; (which may contain side-effects!) might have to be executed again for + ;; every retry. + (exec (if restartable? "begin;" "begin immediate;")) + (catch #t (lambda () - ;; We use begin immediate here so that if we need to retry, we - ;; figure that out immediately rather than because some SQLITE_BUSY - ;; exception gets thrown partway through PROC - in which case the - ;; part already executed (which may contain side-effects!) would be - ;; executed again for every retry. - (sqlite-exec db "begin immediate;") - (let ((result (proc))) - (sqlite-exec db "commit;") - result)) - (lambda (key who error description) - (if (= error SQLITE_BUSY) - (call-with-transaction db proc) - (begin - (sqlite-exec db "rollback;") - (throw 'sqlite-error who error description)))))) + (let-values ((result (proc))) + (exec "commit;") + (apply values result))) + (lambda args + ;; The roll back may or may not have occurred automatically when the + ;; error was generated. If it has occurred, this does nothing but signal + ;; an error. If it hasn't occurred, this needs to be done. + (false-if-exception (exec "rollback;")) + (apply throw args)))) + (define* (call-with-savepoint db proc #:optional (savepoint-name "SomeSavepoint")) "Call PROC after creating a savepoint named SAVEPOINT-NAME. If PROC exits @@ -141,6 +158,18 @@ prior to returning." (lambda () (exec (string-append "RELEASE " savepoint-name ";"))))) +(define* (call-with-retrying-transaction db proc #:key restartable?) + (call-with-SQLITE_BUSY-retrying + (lambda () + (call-with-transaction db proc #:restartable? 
restartable?)))) + +(define* (call-with-retrying-savepoint db proc + #:optional (savepoint-name + "SomeSavepoint")) + (call-with-SQLITE_BUSY-retrying + (lambda () + (call-with-savepoint db proc savepoint-name)))) + (define %default-database-file ;; Default location of the store database. (string-append %store-database-directory "/db.sqlite")) @@ -431,7 +460,7 @@ Write a progress report to LOG-PORT." (mkdir-p db-dir) (parameterize ((sql-schema schema)) (with-database (string-append db-dir "/db.sqlite") db - (call-with-transaction db + (call-with-retrying-transaction db (lambda () (let* ((prefix (format #f "registering ~a items" (length items))) (progress (progress-reporter/bar (length items) -- 2.26.2 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 487 bytes --] ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [bug#41658] [PATCH] fixes / improvements for (guix store database)
From: Ludovic Courtès @ 2020-06-04 16:40 UTC
To: Caleb Ristvedt; +Cc: 41658

Hi,

Thanks for the thorough investigation and for the patches!

Caleb Ristvedt <caleb.ristvedt@cune.org> skribis:

> From cce653c590be1506e15044e445aa9805370ac759 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 18:50:07 -0500
> Subject: [PATCH 1/4] database: work around guile-sqlite3 bug preventing
>  statement reset
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
> keep re-preparing statements that are frequently used. Unfortunately it
> doesn't quite emulate the semantics of sqlite_finalize properly, because it
> doesn't cause a commit if the statement being finalized is the last "active"
> statement. We work around this by wrapping sqlite-finalize with our own
> version that ensures sqlite-reset is called, which does The Right Thing™.
>
> * guix/store/database.scm (sqlite-finalize): new procedure that shadows the
> sqlite-finalize from (sqlite3).

Nice. It would be great if you could report it upstream (Danny and/or
myself can then patch it directly in guile-sqlite3 and push out a
release) and refer to the issue from here.

We can have this patch locally in the meantime, unless it would break
once the new guile-sqlite3 is out. WDYT?

> From ee24ab21122b1c75a7d67d7062550e15e54ab62f Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 19:21:43 -0500
> Subject: [PATCH 2/4] database: rewrite query procedures in terms of
>  with-statement.
>
> Most of our queries would fail to finalize their statements properly if sqlite
> returned an error during their execution. This resolves that, and also makes
> them somewhat more concise as a side-effect.
>
> This also makes some small changes to improve certain queries where behavior
> was strange or overly verbose.
>
> * guix/store/database.scm (call-with-statement): new procedure.
> (with-statement): new macro.
> (last-insert-row-id, path-id, update-or-insert, add-references): rewrite to
> use with-statement.
> (update-or-insert): factor last-insert-row-id out of the end of both
> branches.
> (add-references): remove pointless last-insert-row-id call.
>
> * .dir-locals.el (with-statement): add indenting information.

LGTM!

> From 7d34c27c33aed3e8a49b9796a62a8c19d352e653 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 21:43:14 -0500
> Subject: [PATCH 3/4] database: ensure update-or-insert is run within a
>  transaction
>
> update-or-insert can break if an insert occurs between when it decides whether
> to update or insert and when it actually performs that operation. Putting the
> check and the update/insert operation in the same transaction ensures that the
> update/insert will only succeed if no other write has occurred in the middle.
>
> * guix/store/database.scm (call-with-savepoint): new procedure.
> (update-or-insert): use call-with-savepoint to ensure the read and the
> insert/update occur within the same transaction.

That’s a bit beyond my understanding, but I think you can also push this
one. :-)

Make sure “make check TESTS=tests/store-database.scm” is still happy.

Thanks a lot!

Ludo’.
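The point of patch 2/4 quoted above (statements were left unfinalized when a query threw) rests on how dynamic-wind behaves in call-with-statement. Below is a minimal sketch of that behavior, assuming nothing beyond core Guile. The names finalized? and call-with-fake-statement are hypothetical stand-ins, and no SQLite call is involved; a flag takes the place of sqlite-finalize.

```scheme
;; Stand-in for "the statement has been finalized".  Hypothetical; the real
;; call-with-statement invokes sqlite-finalize in its exit guard instead.
(define finalized? #f)

(define (call-with-fake-statement proc)
  (dynamic-wind
    (const #t)                          ; entry guard: nothing to set up
    (lambda () (proc 'fake-stmt))       ; body: run PROC on the "statement"
    (lambda () (set! finalized? #t))))  ; exit guard: runs on any exit

;; The body throws before returning, the way a failing query would.
(define outcome
  (catch 'sqlite-error
    (lambda ()
      (call-with-fake-statement
       (lambda (stmt)
         (throw 'sqlite-error 'demo 1 "simulated failure"))))
    (lambda args 'caught)))

;; At this point OUTCOME is 'caught and FINALIZED? is #t: the exit guard ran
;; even though the body never returned normally.
```

With the pre-patch shape (prepare, run, then finalize as the last expression in a let), the throw would skip the finalize call entirely. One caveat of the dynamic-wind approach is that the exit guard also runs on any other non-local exit from the body, so the body should not expect to be re-entered with the statement still usable.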
* [bug#41658] [PATCH] fixes / improvements for (guix store database)
From: Danny Milosavljevic @ 2020-06-04 17:00 UTC
To: Ludovic Courtès, Caleb Ristvedt; +Cc: 41658

[-- Attachment #1: Type: text/plain, Size: 2983 bytes --]

Hi Ludo,
Hi Caleb,

On Thu, 04 Jun 2020 18:40:35 +0200
Ludovic Courtès <ludo@gnu.org> wrote:

> Nice. It would be great if you could report it upstream (Danny and/or
> myself can then patch it directly in guile-sqlite3 and push out a
> release) and refer to the issue from here.

I agree.

It's easy to change sqlite-finalize in guile-sqlite3 to call sqlite-reset,
basically just adapt

  (define sqlite-finalize
    (let ((f (pointer->procedure
              int
              (dynamic-func "sqlite3_finalize" libsqlite3)
              (list '*))))
      (lambda (stmt)
        ;; Note: When STMT is cached, this is a no-op. This ensures caching
        ;; actually works while still separating concerns: users can turn
        ;; caching on and off without having to change the rest of their code.
        (when (and (stmt-live? stmt)
                   (not (stmt-cached? stmt)))
          (let ((p (stmt-pointer stmt)))
            (sqlite-remove-statement! (stmt->db stmt) stmt)
            (set-stmt-live?! stmt #f)
            (f p))))))

so that it calls sqlite-reset in the "when"'s new "else" branch there.

(we could also always call sqlite3_reset on sqlite-finalize anyway, it
wouldn't hurt but it wouldn't help either)

I agree that sqlite-finalize should model sqlite's finalization behavior as
much as possible.

Also, the comment about this being a no-op is not true then anymore.

We should definitely also pick up Caleb's comment upstream:

+  ;; Cached statements aren't reset when sqlite-finalize is invoked on
+  ;; them. This can cause problems with automatically-started transactions:
+  ;;
+  ;; "An implicit transaction (a transaction that is started automatically,
+  ;; not a transaction started by BEGIN) is committed automatically when the
+  ;; last active statement finishes. A statement finishes when its last cursor
+  ;; closes, which is guaranteed to happen when the prepared statement is
+  ;; reset or finalized. Some statements might "finish" for the purpose of
+  ;; transaction control prior to being reset or finalized, but there is no
+  ;; guarantee of this."
+  ;;
+  ;; Thus, it's possible for an implicitly-started transaction to hang around
+  ;; until sqlite-reset is called when the cached statement is next
+  ;; used. Because the transaction is committed automatically only when the
+  ;; *last active statement* finishes, the implicitly-started transaction may
+  ;; later be upgraded to a write transaction (!) and this non-reset statement
+  ;; will still be keeping the transaction from committing until it is next
+  ;; used or the database connection is closed. This has the potential to make
+  ;; (exclusive) write access to the database necessary for much longer than
+  ;; it should be.
+  ;;
+  ;; (see https://www.sqlite.org/lang_transaction.html)

@Caleb:

Could you file an issue at https://notabug.org/guile-sqlite3/guile-sqlite3/issues
and pull request so this is auditable?

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
* [bug#41658] [PATCH] fixes / improvements for (guix store database)
From: Ludovic Courtès @ 2020-06-05 16:19 UTC
To: Danny Milosavljevic; +Cc: Caleb Ristvedt, 41658

Hi Danny!

Danny Milosavljevic <dannym@scratchpost.org> skribis:

> I agree. It's easy to change sqlite-finalize in guile-sqlite3 to
> call sqlite-reset, basically just adapt

[...]

> @Caleb:
>
> Could you file an issue at https://notabug.org/guile-sqlite3/guile-sqlite3/issues
> and pull request so this is auditable?

Agreed.

Danny, once this is merged upstream, could you tag a new release? There
are a few other useful improvements in there.

Thanks,
Ludo’.
* [bug#41658] [PATCH] fixes / improvements for (guix store database)
From: Caleb Ristvedt @ 2020-06-08 5:52 UTC
To: Ludovic Courtès; +Cc: Danny Milosavljevic, 41658

[-- Attachment #1.1: Type: text/plain, Size: 3908 bytes --]

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Thanks for the thorough investigation and for the patches!
>
> Caleb Ristvedt <caleb.ristvedt@cune.org> skribis:
>
>> From cce653c590be1506e15044e445aa9805370ac759 Mon Sep 17 00:00:00 2001
>> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
>> Date: Mon, 1 Jun 2020 18:50:07 -0500
>> Subject: [PATCH 1/4] database: work around guile-sqlite3 bug preventing
>>  statement reset
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
>> keep re-preparing statements that are frequently used. Unfortunately it
>> doesn't quite emulate the semantics of sqlite_finalize properly, because it
>> doesn't cause a commit if the statement being finalized is the last "active"
>> statement. We work around this by wrapping sqlite-finalize with our own
>> version that ensures sqlite-reset is called, which does The Right Thing™.
>>
>> * guix/store/database.scm (sqlite-finalize): new procedure that shadows the
>> sqlite-finalize from (sqlite3).
>
> Nice. It would be great if you could report it upstream (Danny and/or
> myself can then patch it directly in guile-sqlite3 and push out a
> release) and refer to the issue from here.

Reported as https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12,
with corresponding pull request
https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.

> We can have this patch locally in the meantime, unless it would break
> once the new guile-sqlite3 is out. WDYT?

I've attached an updated patch series that both includes the
guile-sqlite3 fix as a patch to the guile-sqlite3 package and adopts the
workaround for situations where older guile-sqlite3's must be used (for
example, when building guix from scratch on foreign distros that haven't
incorporated the fix yet). The only changes are the addition of the
now-second patch and fixing up some spacing in the comment in the first
patch.

>> From 7d34c27c33aed3e8a49b9796a62a8c19d352e653 Mon Sep 17 00:00:00 2001
>> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
>> Date: Mon, 1 Jun 2020 21:43:14 -0500
>> Subject: [PATCH 3/4] database: ensure update-or-insert is run within a
>>  transaction
>>
>> update-or-insert can break if an insert occurs between when it decides whether
>> to update or insert and when it actually performs that operation. Putting the
>> check and the update/insert operation in the same transaction ensures that the
>> update/insert will only succeed if no other write has occurred in the middle.
>>
>> * guix/store/database.scm (call-with-savepoint): new procedure.
>> (update-or-insert): use call-with-savepoint to ensure the read and the
>> insert/update occur within the same transaction.
>
> That’s a bit beyond my understanding, but I think you can also push this
> one. :-)

Basically, it's like combining the body of two separate compare-and-swap
loops into a single compare-and-swap loop. This ensures that the view is
consistent (since if it isn't, the "compare" will fail and we'll
retry). It addresses a problem that doesn't exist in practice yet, since
update-or-insert is only called from within a call-with-transaction
currently. But if someone ever wanted to call it from outside of a
call-with-transaction, this would ensure that it still worked correctly.

> Make sure “make check TESTS=tests/store-database.scm” is still happy.

Works on my system.

Related question: does berlin export /var/guix over NFS as per
http://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster? If so,
that could interact poorly with our use of WAL mode:

"All processes using a database must be on the same host computer; WAL
does not work over a network filesystem." - https://sqlite.org/wal.html.

- reepca

[-- Attachment #1.2: 0001-database-work-around-guile-sqlite3-bug-preventing-st.patch --]
[-- Type: text/x-patch, Size: 3478 bytes --]

From 614213c80a7ea15f7aab9502e6c33206ac089d05 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 18:50:07 -0500
Subject: [PATCH 1/5] database: work around guile-sqlite3 bug preventing
 statement reset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
keep re-preparing statements that are frequently used. Unfortunately it
doesn't quite emulate the semantics of sqlite_finalize properly, because it
doesn't cause a commit if the statement being finalized is the last "active"
statement (see https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12). We
work around this by wrapping sqlite-finalize with our own version that ensures
sqlite-reset is called, which does The Right Thing™.

* guix/store/database.scm (sqlite-finalize): new procedure that shadows the
sqlite-finalize from (sqlite3).
---
 guix/store/database.scm | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/guix/store/database.scm b/guix/store/database.scm
index ef52036ede..15f5791a08 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -130,6 +130,38 @@ transaction after it finishes."
 If FILE doesn't exist, create it and initialize it as a new database."
   (call-with-database file (lambda (db) exp ...)))
 
+(define (sqlite-finalize stmt)
+  ;; As of guile-sqlite3 0.1.0, cached statements aren't reset when
+  ;; sqlite-finalize is invoked on them (see
+  ;; https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12). This can
+  ;; cause problems with automatically-started transactions:
+  ;;
+  ;; "An implicit transaction (a transaction that is started automatically,
+  ;; not a transaction started by BEGIN) is committed automatically when the
+  ;; last active statement finishes. A statement finishes when its last cursor
+  ;; closes, which is guaranteed to happen when the prepared statement is
+  ;; reset or finalized. Some statements might "finish" for the purpose of
+  ;; transaction control prior to being reset or finalized, but there is no
+  ;; guarantee of this."
+  ;;
+  ;; Thus, it's possible for an implicitly-started transaction to hang around
+  ;; until sqlite-reset is called when the cached statement is next
+  ;; used. Because the transaction is committed automatically only when the
+  ;; *last active statement* finishes, the implicitly-started transaction may
+  ;; later be upgraded to a write transaction (!) and this non-reset statement
+  ;; will still be keeping the transaction from committing until it is next
+  ;; used or the database connection is closed. This has the potential to make
+  ;; (exclusive) write access to the database necessary for much longer than
+  ;; it should be.
+  ;;
+  ;; (see https://www.sqlite.org/lang_transaction.html)
+  ;; To work around this, we wrap sqlite-finalize so that sqlite-reset is
+  ;; always called. This will continue working even when the behavior is fixed
+  ;; in guile-sqlite3, since resetting twice doesn't cause any problems. We
+  ;; can remove this once the fixed guile-sqlite3 is widespread.
+  (sqlite-reset stmt)
+  ((@ (sqlite3) sqlite-finalize) stmt))
+
 (define (last-insert-row-id db)
   ;; XXX: (sqlite3) currently lacks bindings for 'sqlite3_last_insert_rowid'.
   ;; Work around that.
--
2.26.2

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.3: 0002-gnu-guile-sqlite3-add-patch-to-fix-sqlite-finalize-b.patch --]
[-- Type: text/x-patch, Size: 6630 bytes --]

From e3cf7be4491f465d3041933596d3caad1ea64e83 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Sun, 7 Jun 2020 22:30:41 -0500
Subject: [PATCH 2/5] gnu: guile-sqlite3: add patch to fix sqlite-finalize bug

Adds patch that fixes
https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12. This can be
discarded once the patch is integrated into the next guile-sqlite3 release.
Note that the patch is identical to the pull request at
https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.

* gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch: new
patch.
* gnu/packages/guile.scm (guile-sqlite3): use it.
* gnu/local.mk (dist_patch_DATA): add it.
---
 gnu/local.mk                                  |  1 +
 gnu/packages/guile.scm                        |  3 +-
 ...ile-sqlite3-reset-on-sqlite-finalize.patch | 74 +++++++++++++++++++
 3 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch

diff --git a/gnu/local.mk b/gnu/local.mk
index ae8a2275f7..d382205a03 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -1059,6 +1059,7 @@ dist_patch_DATA = \
   %D%/packages/patches/guile-rsvg-pkgconfig.patch \
   %D%/packages/patches/guile-emacs-fix-configure.patch \
   %D%/packages/patches/guile-sqlite3-fix-cross-compilation.patch \
+  %D%/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch \
   %D%/packages/patches/gtk2-respect-GUIX_GTK2_PATH.patch \
   %D%/packages/patches/gtk2-respect-GUIX_GTK2_IM_MODULE_FILE.patch \
   %D%/packages/patches/gtk2-theme-paths.patch \
diff --git a/gnu/packages/guile.scm b/gnu/packages/guile.scm
index c2dc7f6d5f..63cd91a1cb 100644
--- a/gnu/packages/guile.scm
+++ b/gnu/packages/guile.scm
@@ -645,7 +645,8 @@ Guile's foreign function interface.")
                 "1nv8j7wk6b5n4p22szyi8lv8fs31rrzxhzz16gyj8r38c1fyp9qp"))
               (file-name (string-append name "-" version "-checkout"))
               (patches
-               (search-patches "guile-sqlite3-fix-cross-compilation.patch"))
+               (search-patches "guile-sqlite3-fix-cross-compilation.patch"
+                               "guile-sqlite3-reset-on-sqlite-finalize.patch"))
               (modules '((guix build utils)))
               (snippet
                '(begin
diff --git a/gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch b/gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch
new file mode 100644
index 0000000000..b6bf5325ad
--- /dev/null
+++ b/gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch
@@ -0,0 +1,74 @@
+From c59db66f9ac754b40463c6788ab9bad4f045cc92 Mon Sep 17 00:00:00 2001
+From: Caleb Ristvedt <caleb.ristvedt@cune.org>
+Date: Sun, 7 Jun 2020 18:27:44 -0500
+Subject: [PATCH] Reset statement when sqlite-finalize is called on cached
+ statement
+
+Automatically-started transactions only end when a statement finishes, which is
+normally either when sqlite-reset or sqlite-finalize is called for that
+statement. Consequently, transactions automatically started by cached
+statements won't end until the statement is next reused by sqlite-prepare or
+sqlite-reset is called on it. This changes sqlite-finalize so that it preserves
+the statement-finishing (and thus transaction-finishing) behavior of
+sqlite_finalize.
+---
+ sqlite3.scm.in | 43 ++++++++++++++++++++++++++++++++++---------
+ 1 file changed, 34 insertions(+), 9 deletions(-)
+
+diff --git a/sqlite3.scm.in b/sqlite3.scm.in
+index 77b5032..19241dc 100644
+--- a/sqlite3.scm.in
++++ b/sqlite3.scm.in
+@@ -274,15 +274,40 @@ statements, into DB. The result is unspecified."
+            (dynamic-func "sqlite3_finalize" libsqlite3)
+            (list '*))))
+     (lambda (stmt)
+-      ;; Note: When STMT is cached, this is a no-op. This ensures caching
+-      ;; actually works while still separating concerns: users can turn
+-      ;; caching on and off without having to change the rest of their code.
+-      (when (and (stmt-live? stmt)
+-                 (not (stmt-cached? stmt)))
+-        (let ((p (stmt-pointer stmt)))
+-          (sqlite-remove-statement! (stmt->db stmt) stmt)
+-          (set-stmt-live?! stmt #f)
+-          (f p))))))
++      ;; Note: When STMT is cached, this merely resets. This ensures caching
++      ;; actually works while still separating concerns: users can turn caching
++      ;; on and off without having to change the rest of their code.
++      (if (and (stmt-live? stmt)
++               (not (stmt-cached? stmt)))
++          (let ((p (stmt-pointer stmt)))
++            (sqlite-remove-statement! (stmt->db stmt) stmt)
++            (set-stmt-live?! stmt #f)
++            (f p))
++          ;; It's necessary to reset cached statements due to the following:
++          ;;
++          ;; "An implicit transaction (a transaction that is started
++          ;; automatically, not a transaction started by BEGIN) is committed
++          ;; automatically when the last active statement finishes. A statement
++          ;; finishes when its last cursor closes, which is guaranteed to happen
++          ;; when the prepared statement is reset or finalized. Some statements
++          ;; might "finish" for the purpose of transaction control prior to
++          ;; being reset or finalized, but there is no guarantee of this."
++          ;;
++          ;; (see https://www.sqlite.org/lang_transaction.html)
++          ;;
++          ;; Thus, it's possible for an implicitly-started transaction to hang
++          ;; around until sqlite-reset is called when the cached statement is
++          ;; next used. Because the transaction is committed automatically only
++          ;; when the *last active statement* finishes, the implicitly-started
++          ;; transaction may later be upgraded to a write transaction (!) and
++          ;; this non-reset statement will still be keeping the transaction from
++          ;; committing until it is next used or the database connection is
++          ;; closed. This has the potential to make (exclusive) write access to
++          ;; the database necessary for much longer than it should be.
++          ;;
++          ;; So it's necessary to preserve the statement-finishing behavior of
++          ;; sqlite_finalize here, which we do by calling sqlite-reset.
++          (sqlite-reset stmt)))))
+ 
+ (define *stmt-map* (make-weak-key-hash-table))
+ (define (stmt->db stmt)
+--
+2.26.2
+
--
2.26.2

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.4: 0003-database-rewrite-query-procedures-in-terms-of-with-s.patch --]
[-- Type: text/x-patch, Size: 5878 bytes --]

From 1d0b2a6742935b812fe56ae97e34f4a35fed3348 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 19:21:43 -0500
Subject: [PATCH 3/5] database: rewrite query procedures in terms of
 with-statement.

Most of our queries would fail to finalize their statements properly if sqlite
returned an error during their execution. This resolves that, and also makes
them somewhat more concise as a side-effect.

This also makes some small changes to improve certain queries where behavior
was strange or overly verbose.

* guix/store/database.scm (call-with-statement): new procedure.
(with-statement): new macro.
(last-insert-row-id, path-id, update-or-insert, add-references): rewrite to
use with-statement.
(update-or-insert): factor last-insert-row-id out of the end of both
branches.
(add-references): remove pointless last-insert-row-id call.

* .dir-locals.el (with-statement): add indenting information.
---
 .dir-locals.el          |  1 +
 guix/store/database.scm | 53 ++++++++++++++++++++++-------------------
 2 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index dc8bc0e437..77c12f9411 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -89,6 +89,7 @@
 
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
+   (eval . (put 'with-statement 'scheme-indent-function 3))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 15f5791a08..2a943a7eb0 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -162,14 +162,26 @@ If FILE doesn't exist, create it and initialize it as a new database."
   (sqlite-reset stmt)
   ((@ (sqlite3) sqlite-finalize) stmt))
 
+(define (call-with-statement db sql proc)
+  (let ((stmt (sqlite-prepare db sql #:cache? #t)))
+    (dynamic-wind
+      (const #t)
+      (lambda ()
+        (proc stmt))
+      (lambda ()
+        (sqlite-finalize stmt)))))
+
+(define-syntax-rule (with-statement db sql stmt exp ...)
+  "Run EXP... with STMT bound to a prepared statement corresponding to the sql
+string SQL for DB."
+  (call-with-statement db sql
+                       (lambda (stmt) exp ...)))
+
 (define (last-insert-row-id db)
   ;; XXX: (sqlite3) currently lacks bindings for 'sqlite3_last_insert_rowid'.
   ;; Work around that.
-  (let* ((stmt (sqlite-prepare db "SELECT last_insert_rowid();"
-                               #:cache? #t))
-         (result (sqlite-fold cons '() stmt)))
-    (sqlite-finalize stmt)
-    (match result
+  (with-statement db "SELECT last_insert_rowid();" stmt
+    (match (sqlite-fold cons '() stmt)
       ((#(id)) id)
       (_ #f))))
 
@@ -179,13 +191,11 @@ If FILE doesn't exist, create it and initialize it as a new database."
 (define* (path-id db path)
   "If PATH exists in the 'ValidPaths' table, return its numerical
 identifier. Otherwise, return #f."
-  (let ((stmt (sqlite-prepare db path-id-sql #:cache? #t)))
+  (with-statement db path-id-sql stmt
     (sqlite-bind-arguments stmt #:path path)
-    (let ((result (sqlite-fold cons '() stmt)))
-      (sqlite-finalize stmt)
-      (match result
-        ((#(id) . _) id)
-        (_ #f)))))
+    (match (sqlite-fold cons '() stmt)
+      ((#(id) . _) id)
+      (_ #f))))
 
 (define update-sql
   "UPDATE ValidPaths SET hash = :hash, registrationTime = :time, deriver =
@@ -202,20 +212,17 @@ and re-inserting instead of updating, which causes problems with foreign keys,
 of course. Returns the row id of the row that was modified or inserted."
   (let ((id (path-id db path)))
     (if id
-        (let ((stmt (sqlite-prepare db update-sql #:cache? #t)))
+        (with-statement db update-sql stmt
           (sqlite-bind-arguments stmt #:id id
                                  #:deriver deriver
                                  #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt)
-          (sqlite-finalize stmt)
-          (last-insert-row-id db))
-        (let ((stmt (sqlite-prepare db insert-sql #:cache? #t)))
+          (sqlite-fold cons '() stmt))
+        (with-statement db insert-sql stmt
           (sqlite-bind-arguments stmt
                                  #:path path #:deriver deriver
                                  #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt) ;execute it
-          (sqlite-finalize stmt)
-          (last-insert-row-id db)))))
+          (sqlite-fold cons '() stmt)))
+    (last-insert-row-id db)))
 
 (define add-reference-sql
   "INSERT OR REPLACE INTO Refs (referrer, reference) VALUES (:referrer, :reference);")
@@ -223,15 +230,13 @@ of course. Returns the row id of the row that was modified or inserted."
 (define (add-references db referrer references)
   "REFERRER is the id of the referring store item, REFERENCES is a list
 ids of items referred to."
-  (let ((stmt (sqlite-prepare db add-reference-sql #:cache? #t)))
+  (with-statement db add-reference-sql stmt
     (for-each (lambda (reference)
                 (sqlite-reset stmt)
                 (sqlite-bind-arguments stmt #:referrer referrer
                                        #:reference reference)
-                (sqlite-fold cons '() stmt) ;execute it
-                (last-insert-row-id db))
-              references)
-    (sqlite-finalize stmt)))
+                (sqlite-fold cons '() stmt))
+              references)))
 
 (define* (sqlite-register db #:key path (references '())
                           deriver hash nar-size time)
--
2.26.2

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.5: 0004-database-ensure-update-or-insert-is-run-within-a-tra.patch --]
[-- Type: text/x-patch, Size: 5730 bytes --]

From 00b220d1c380004a9bb128b73bb8027f2ff156f0 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 21:43:14 -0500
Subject: [PATCH 4/5] database: ensure update-or-insert is run within a
 transaction

update-or-insert can break if an insert occurs between when it decides whether
to update or insert and when it actually performs that operation. Putting the
check and the update/insert operation in the same transaction ensures that the
update/insert will only succeed if no other write has occurred in the middle.

* guix/store/database.scm (call-with-savepoint): new procedure.
(update-or-insert): use call-with-savepoint to ensure the read and the
insert/update occur within the same transaction.
---
 .dir-locals.el          |  1 +
 guix/store/database.scm | 68 +++++++++++++++++++++++++++++++++--------
 2 files changed, 56 insertions(+), 13 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index 77c12f9411..d9c81b2a48 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -90,6 +90,7 @@
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
    (eval . (put 'with-statement 'scheme-indent-function 3))
+   (eval . (put 'call-with-savepoint 'scheme-indent-function 1))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 2a943a7eb0..14aaeef176 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -120,6 +120,26 @@ transaction after it finishes."
           (begin
             (sqlite-exec db "rollback;")
             (throw 'sqlite-error who error description))))))
+(define* (call-with-savepoint db proc
+                              #:optional (savepoint-name "SomeSavepoint"))
+  "Call PROC after creating a savepoint named SAVEPOINT-NAME. If PROC exits
+abnormally, rollback to that savepoint. In all cases, remove the savepoint
+prior to returning."
+  (define (exec sql)
+    (with-statement db sql stmt
+      (sqlite-fold cons '() stmt)))
+
+  (dynamic-wind
+    (lambda ()
+      (exec (string-append "SAVEPOINT " savepoint-name ";")))
+    (lambda ()
+      (catch #t
+        proc
+        (lambda args
+          (exec (string-append "ROLLBACK TO " savepoint-name ";"))
+          (apply throw args))))
+    (lambda ()
+      (exec (string-append "RELEASE " savepoint-name ";")))))
 
 (define %default-database-file
   ;; Default location of the store database.
@@ -210,19 +230,41 @@ VALUES (:path, :hash, :time, :deriver, :size)")
 doesn't exactly have... they've got something close, but it involves deleting
 and re-inserting instead of updating, which causes problems with foreign keys,
 of course. Returns the row id of the row that was modified or inserted."
-  (let ((id (path-id db path)))
-    (if id
-        (with-statement db update-sql stmt
-          (sqlite-bind-arguments stmt #:id id
-                                 #:deriver deriver
-                                 #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt))
-        (with-statement db insert-sql stmt
-          (sqlite-bind-arguments stmt
-                                 #:path path #:deriver deriver
-                                 #:hash hash #:size nar-size #:time time)
-          (sqlite-fold cons '() stmt)))
-    (last-insert-row-id db)))
+
+  ;; It's important that querying the path-id and the insert/update operation
+  ;; take place in the same transaction, as otherwise some other
+  ;; process/thread/fiber could register the same path between when we check
+  ;; whether it's already registered and when we register it, resulting in
+  ;; duplicate paths (which, due to a 'unique' constraint, would cause an
+  ;; exception to be thrown). With the default journaling mode this will
+  ;; prevent writes from occurring during that sensitive time, but with WAL
+  ;; mode it will instead arrange to return SQLITE_BUSY when a write occurs
+  ;; between the start of a read transaction and its upgrading to a write
+  ;; transaction (see https://sqlite.org/rescode.html#busy_snapshot).
+  ;; Experimentally, it seems this SQLITE_BUSY will ignore a busy_timeout and
+  ;; immediately return (makes sense, since waiting won't change anything).
+
+  ;; Note that when that kind of SQLITE_BUSY error is returned, it will keep
+  ;; being returned every time we try to upgrade the same outermost
+  ;; transaction to a write transaction. So when retrying, we have to restart
+  ;; the *outermost* write transaction. We can't inherently tell whether
+  ;; we're the outermost write transaction, so we leave the retry-handling to
+  ;; the caller.
+  (call-with-savepoint db
+    (lambda ()
+      (let ((id (path-id db path)))
+        (if id
+            (with-statement db update-sql stmt
+              (sqlite-bind-arguments stmt #:id id
+                                     #:deriver deriver
+                                     #:hash hash #:size nar-size #:time time)
+              (sqlite-fold cons '() stmt))
+            (with-statement db insert-sql stmt
+              (sqlite-bind-arguments stmt
+                                     #:path path #:deriver deriver
+                                     #:hash hash #:size nar-size #:time time)
+              (sqlite-fold cons '() stmt)))
+        (last-insert-row-id db)))))
 
 (define add-reference-sql
   "INSERT OR REPLACE INTO Refs (referrer, reference) VALUES (:referrer, :reference);")
--
2.26.2

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.6: 0005-database-separate-transaction-handling-and-retry-han.patch --]
[-- Type: text/x-patch, Size: 6216 bytes --]

From af87a556dab110ed7a6ef2fa1584778cc60be682 Mon Sep 17 00:00:00 2001
From: Caleb Ristvedt <caleb.ristvedt@cune.org>
Date: Mon, 1 Jun 2020 22:15:21 -0500
Subject: [PATCH 5/5] database: separate transaction-handling and
 retry-handling.

Previously call-with-transaction would both retry when SQLITE_BUSY errors were
thrown and do what its name suggested (start and rollback/commit a
transaction). This changes it to do only what its name implies, which
simplifies its implementation. Retrying is provided by the new
call-with-SQLITE_BUSY-retrying procedure.

* guix/store/database.scm (call-with-transaction): no longer restarts, new
#:restartable? argument controls whether "begin" or "begin immediate" is
used.
(call-with-SQLITE_BUSY-retrying, call-with-retrying-transaction,
call-with-retrying-savepoint): new procedures.
(register-items): use call-with-retrying-transaction to preserve old
behavior.

* .dir-locals.el (call-with-retrying-transaction,
call-with-retrying-savepoint): add indentation information.
---
 .dir-locals.el          |  2 ++
 guix/store/database.scm | 69 +++++++++++++++++++++++++++++------------
 2 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index d9c81b2a48..b88ec7a795 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -90,7 +90,9 @@
    (eval . (put 'with-database 'scheme-indent-function 2))
    (eval . (put 'call-with-transaction 'scheme-indent-function 2))
    (eval . (put 'with-statement 'scheme-indent-function 3))
+   (eval . (put 'call-with-retrying-transaction 'scheme-indent-function 2))
    (eval . (put 'call-with-savepoint 'scheme-indent-function 1))
+   (eval . (put 'call-with-retrying-savepoint 'scheme-indent-function 1))
 
    (eval . (put 'call-with-container 'scheme-indent-function 1))
    (eval . (put 'container-excursion 'scheme-indent-function 1))
diff --git a/guix/store/database.scm b/guix/store/database.scm
index 14aaeef176..4921e9f0e2 100644
--- a/guix/store/database.scm
+++ b/guix/store/database.scm
@@ -99,27 +99,44 @@ create it and initialize it as a new database."
 ;; XXX: missing in guile-sqlite3@0.1.0
 (define SQLITE_BUSY 5)
 
-(define (call-with-transaction db proc)
-  "Start a transaction with DB (make as many attempts as necessary) and run
-PROC. If PROC exits abnormally, abort the transaction, otherwise commit the
-transaction after it finishes."
+(define (call-with-SQLITE_BUSY-retrying thunk)
+  "Call THUNK, retrying as long as it exits abnormally due to SQLITE_BUSY
+errors."
   (catch 'sqlite-error
+    thunk
+    (lambda (key who code errmsg)
+      (if (= code SQLITE_BUSY)
+          (call-with-SQLITE_BUSY-retrying thunk)
+          (throw key who code errmsg)))))
+
+
+
+(define* (call-with-transaction db proc #:key restartable?)
+  "Start a transaction with DB and run PROC. If PROC exits abnormally, abort
+the transaction, otherwise commit the transaction after it finishes.
+RESTARTABLE? may be set to a non-#f value when it is safe to run PROC multiple
+times. This may reduce contention for the database somewhat."
+  (define (exec sql)
+    (with-statement db sql stmt
+      (sqlite-fold cons '() stmt)))
+  ;; We might use begin immediate here so that if we need to retry, we figure
+  ;; that out immediately rather than because some SQLITE_BUSY exception gets
+  ;; thrown partway through PROC - in which case the part already executed
+  ;; (which may contain side-effects!) might have to be executed again for
+  ;; every retry.
+  (exec (if restartable? "begin;" "begin immediate;"))
+  (catch #t
     (lambda ()
-      ;; We use begin immediate here so that if we need to retry, we
-      ;; figure that out immediately rather than because some SQLITE_BUSY
-      ;; exception gets thrown partway through PROC - in which case the
-      ;; part already executed (which may contain side-effects!) would be
-      ;; executed again for every retry.
-      (sqlite-exec db "begin immediate;")
-      (let ((result (proc)))
-        (sqlite-exec db "commit;")
-        result))
-    (lambda (key who error description)
-      (if (= error SQLITE_BUSY)
-          (call-with-transaction db proc)
-          (begin
-            (sqlite-exec db "rollback;")
-            (throw 'sqlite-error who error description))))))
+      (let-values ((result (proc)))
+        (exec "commit;")
+        (apply values result)))
+    (lambda args
+      ;; The roll back may or may not have occurred automatically when the
+      ;; error was generated. If it has occurred, this does nothing but signal
+      ;; an error. If it hasn't occurred, this needs to be done.
+      (false-if-exception (exec "rollback;"))
+      (apply throw args))))
+
 (define* (call-with-savepoint db proc
                               #:optional (savepoint-name "SomeSavepoint"))
   "Call PROC after creating a savepoint named SAVEPOINT-NAME. If PROC exits
@@ -141,6 +158,18 @@ prior to returning."
     (lambda ()
       (exec (string-append "RELEASE " savepoint-name ";")))))
 
+(define* (call-with-retrying-transaction db proc #:key restartable?)
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-transaction db proc #:restartable? restartable?))))
+
+(define* (call-with-retrying-savepoint db proc
+                                       #:optional (savepoint-name
+                                                   "SomeSavepoint"))
+  (call-with-SQLITE_BUSY-retrying
+   (lambda ()
+     (call-with-savepoint db proc savepoint-name))))
+
 (define %default-database-file
   ;; Default location of the store database.
   (string-append %store-database-directory "/db.sqlite"))
@@ -433,7 +462,7 @@ Write a progress report to LOG-PORT."
   (mkdir-p db-dir)
   (parameterize ((sql-schema schema))
     (with-database (string-append db-dir "/db.sqlite") db
-      (call-with-transaction db
+      (call-with-retrying-transaction db
         (lambda ()
           (let* ((prefix (format #f "registering ~a items" (length items)))
                  (progress (progress-reporter/bar (length items)
--
2.26.2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]
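
The retry behaviour that the last patch splits out of call-with-transaction can be tried without a database. The sketch below copies call-with-SQLITE_BUSY-retrying from the patch and drives it with `flaky-write` and `broken-write`, two hypothetical thunks invented for illustration; the `'sqlite-error` argument layout (who, code, message) is taken from the handler in the patch, everything else is an assumption.

```scheme
;; Value used by the patch, since guile-sqlite3 0.1.0 does not export it.
(define SQLITE_BUSY 5)

;; Copied from the patch: re-run THUNK for as long as it fails with
;; SQLITE_BUSY; any other sqlite-error is passed on unchanged.
(define (call-with-SQLITE_BUSY-retrying thunk)
  (catch 'sqlite-error
    thunk
    (lambda (key who code errmsg)
      (if (= code SQLITE_BUSY)
          (call-with-SQLITE_BUSY-retrying thunk)
          (throw key who code errmsg)))))

;; Hypothetical thunk: reports a busy database twice, then succeeds.
(define attempts 0)
(define (flaky-write)
  (set! attempts (+ attempts 1))
  (if (< attempts 3)
      (throw 'sqlite-error 'flaky-write SQLITE_BUSY "database is locked")
      'committed))

(define result (call-with-SQLITE_BUSY-retrying flaky-write))

;; Hypothetical thunk: fails with a non-busy code (1), which is not retried.
(define (broken-write)
  (throw 'sqlite-error 'broken-write 1 "SQL logic error"))

(define other-error
  (catch 'sqlite-error
    (lambda () (call-with-SQLITE_BUSY-retrying broken-write))
    (lambda (key who code errmsg) code)))
```

Here `attempts` ends up at 3, which illustrates the concern in the patch's comment: every retry re-runs the whole thunk, side effects included. That is presumably why "begin immediate" stays the default and plain "begin" is only used when the caller declares the procedure restartable.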
* [bug#41658] [PATCH] fixes / improvements for (guix store database)
  2020-06-08 5:52 ` Caleb Ristvedt
@ 2020-06-09 8:42 ` Ludovic Courtès
  0 siblings, 0 replies; 6+ messages in thread
From: Ludovic Courtès @ 2020-06-09 8:42 UTC (permalink / raw)
  To: Caleb Ristvedt; +Cc: Danny Milosavljevic, 41658

Hi,

Caleb Ristvedt <caleb.ristvedt@cune.org> skribis:

[...]

>> Nice. It would be great if you could report it upstream (Danny and/or
>> myself can then patch it directly in guile-sqlite3 and push out a
>> release) and refer to the issue from here.
>
> Reported as https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12,
> with corresponding pull request
> https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.

Awesome, thank you.

>> We can have this patch locally in the meantime, unless it would break
>> once the new guile-sqlite3 is out. WDYT?
>
> I've attached an updated patch series that both includes the
> guile-sqlite3 fix as a patch to the guile-sqlite3 package and adopts the
> workaround for situations where older guile-sqlite3's must be used (for
> example, when building guix from scratch on foreign distros that haven't
> incorporated the fix yet). The only changes are the addition of the
> now-second patch and fixing up some spacing in the comment in the first
> patch.

OK.

>>> * guix/store/database.scm (call-with-savepoint): new procedure.
>>> (update-or-insert): use call-with-savepoint to ensure the read and the
>>> insert/update occur within the same transaction.
>>
>> That’s a bit beyond my understanding, but I think you can also push this
>> one. :-)
>
> Basically, it's like combining the body of two separate compare-and-swap
> loops into a single compare-and-swap loop. This ensures that the view is
> consistent (since if it isn't, the "compare" will fail and we'll
> retry). It addresses a problem that doesn't exist in practice yet, since
> update-or-insert is only called from within a call-with-transaction
> currently. But if someone ever wanted to call it from outside of a
> call-with-transaction, this would ensure that it still worked correctly.

Makes sense, thanks for explaining.

> Related question: does berlin export /var/guix over NFS as per
> http://hpc.guix.info/blog/2017/11/installing-guix-on-a-cluster? If so,
> that could interact poorly with our use of WAL mode:

No, it doesn’t. (Also, in the setup described above, there’s only one
guix-daemon instance and it accesses the database via the local file
system.)

> From 614213c80a7ea15f7aab9502e6c33206ac089d05 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Mon, 1 Jun 2020 18:50:07 -0500
> Subject: [PATCH 1/5] database: work around guile-sqlite3 bug preventing
> statement reset
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> guile-sqlite3 provides statement caching, making it unnecessary for sqlite to
> keep re-preparing statements that are frequently used. Unfortunately it
> doesn't quite emulate the semantics of sqlite_finalize properly, because it
> doesn't cause a commit if the statement being finalized is the last "active"
> statement (see https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12). We
> work around this by wrapping sqlite-finalize with our own version that ensures
> sqlite-reset is called, which does The Right Thing™.
>
> * guix/store/database.scm (sqlite-finalize): new procedure that shadows the
> sqlite-finalize from (sqlite3).

[...]

> +(define (sqlite-finalize stmt)
> + ;; As of guile-sqlite3 0.1.0, cached statements aren't reset when
> + ;; sqlite-finalize is invoked on them (see
> + ;; https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12). This can
> + ;; cause problems with automatically-started transactions:

I think it’s enough to link to the upstream issue, which has the problem
well documented.

> From e3cf7be4491f465d3041933596d3caad1ea64e83 Mon Sep 17 00:00:00 2001
> From: Caleb Ristvedt <caleb.ristvedt@cune.org>
> Date: Sun, 7 Jun 2020 22:30:41 -0500
> Subject: [PATCH 2/5] gnu: guile-sqlite3: add patch to fix sqlite-finalize bug
>
> Adds patch that fixes
> https://notabug.org/guile-sqlite3/guile-sqlite3/issues/12. This can be
> discarded once the patch is integrated into the next guile-sqlite3 release.
> Note that the patch is identical to the pull request at
> https://notabug.org/guile-sqlite3/guile-sqlite3/pulls/13.
>
> * gnu/packages/patches/guile-sqlite3-reset-on-sqlite-finalize.patch: new
> patch.
> * gnu/packages/guile.scm (guile-sqlite3): use it.
> * gnu/local.mk (dist_patch_DATA): add it.

I’d skip it: we have a workaround and the release may be out soon.

Danny, thoughts on getting a new release out?

The rest is still fine with me, thank you!

Ludo’.