unofficial mirror of guix-patches@gnu.org 
 help / color / mirror / code / Atom feed
* [bug#32879] [PATCH] database: Add builds only if one of their outputs is new.
@ 2018-09-29 20:35 Clément Lassieur
  2018-09-30 19:34 ` Ludovic Courtès
  0 siblings, 1 reply; 6+ messages in thread
From: Clément Lassieur @ 2018-09-29 20:35 UTC (permalink / raw)
  To: 32879

* Makefile.am (dist_sql_DATA): Add 'src/sql/upgrade-4.sql'.
* src/cuirass/database.scm (db-add-output): New procedure.
(db-add-build): Call DB-ADD-OUTPUT, rollback the transaction and return #f if
DB-ADD-OUTPUT returned an empty list.
* src/schema.sql (Outputs): Set 'path' as primary key, instead of 'derivation,
name'.
* src/sql/upgrade-4.sql: New file with SQL queries to upgrade the database.
* tests/database.scm (make-dummy-build): Use the #:OUTPUTS key.  Get default
OUTPUTS to depend on DRV.
("db-add-build-with-fixed-output"): New test.
---
 Makefile.am              |  3 ++-
 src/cuirass/database.scm | 46 +++++++++++++++++++++++++++++-----------
 src/schema.sql           |  3 +--
 src/sql/upgrade-4.sql    | 18 ++++++++++++++++
 tests/database.scm       | 16 ++++++++++++--
 5 files changed, 69 insertions(+), 17 deletions(-)
 create mode 100644 src/sql/upgrade-4.sql

diff --git a/Makefile.am b/Makefile.am
index 2f83659..7cea2ff 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -67,7 +67,8 @@ dist_pkgdata_DATA = src/schema.sql
 dist_sql_DATA = 				\
   src/sql/upgrade-1.sql				\
   src/sql/upgrade-2.sql				\
-  src/sql/upgrade-3.sql
+  src/sql/upgrade-3.sql				\
+  src/sql/upgrade-4.sql
 
 dist_css_DATA =					\
   src/static/css/bootstrap.css			\
diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
index 6777d28..9664f1b 100644
--- a/src/cuirass/database.scm
+++ b/src/cuirass/database.scm
@@ -425,12 +425,33 @@ string."
   (failed-other      3)
   (canceled          4))
 
+(define (db-add-output derivation output)
+  "Insert OUTPUT associated with DERIVATION.  If an output with the same path
+already exists, return #f."
+  (with-db-critical-section db
+    (catch 'sqlite-error
+      (lambda ()
+        (match output
+          ((name . path)
+           (sqlite-exec db "\
+INSERT INTO Outputs (derivation, name, path) VALUES ("
+                        derivation ", " name ", " path ");")))
+        (last-insert-rowid db))
+      (lambda (key who code message . rest)
+        ;; If we get a unique-constraint-failed error, that means we have
+        ;; already inserted the same output.  That happens with fixed-output
+        ;; derivations.
+        (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
+            #f
+            (apply throw key who code rest))))))
+
 (define (db-add-build build)
-  "Store BUILD in database the database.  BUILD eventual outputs are stored in
-the OUTPUTS table."
+  "Store BUILD in database the database only if one of its outputs is new.
+Return #f otherwise.  BUILD outputs are stored in the OUTPUTS table."
   (with-db-critical-section db
     (catch 'sqlite-error
       (lambda ()
+        (sqlite-exec db "BEGIN TRANSACTION;")
         (sqlite-exec db "
 INSERT INTO Builds (derivation, evaluation, job_name, system, nix_name, log,
 status, timestamp, starttime, stoptime)
@@ -446,21 +467,22 @@ VALUES ("
                      (or (assq-ref build #:timestamp) 0) ", "
                      (or (assq-ref build #:starttime) 0) ", "
                      (or (assq-ref build #:stoptime) 0) ");")
-        (let ((derivation (assq-ref build #:derivation)))
-          (for-each (lambda (output)
-                      (match output
-                        ((name . path)
-                         (sqlite-exec db "\
-INSERT INTO Outputs (derivation, name, path) VALUES ("
-                                      derivation ", " name ", " path ");"))))
-                    (assq-ref build #:outputs))
-          derivation))
+        (let* ((derivation (assq-ref build #:derivation))
+               (outputs (assq-ref build #:outputs))
+               (new-outputs (filter-map (cut db-add-output derivation <>)
+                                        outputs)))
+          (if (null? new-outputs)
+              (begin (sqlite-exec db "ROLLBACK;")
+                     #f)
+              (begin (sqlite-exec db "COMMIT;")
+                     derivation))))
       (lambda (key who code message . rest)
         ;; If we get a unique-constraint-failed error, that means we have
         ;; already inserted the same build.  That happens when several jobs
         ;; produce the same derivation, and we can ignore it.
         (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
-            #f
+            (begin (sqlite-exec db "ROLLBACK;")
+                   #f)
             (apply throw key who code rest))))))
 
 (define* (db-update-build-status! drv status #:key log-file)
diff --git a/src/schema.sql b/src/schema.sql
index bfc9ca7..a9e4a6a 100644
--- a/src/schema.sql
+++ b/src/schema.sql
@@ -46,8 +46,7 @@ CREATE TABLE Evaluations (
 CREATE TABLE Outputs (
   derivation TEXT NOT NULL,
   name TEXT NOT NULL,
-  path TEXT NOT NULL,
-  PRIMARY KEY (derivation, name),
+  path TEXT NOT NULL PRIMARY KEY,
   FOREIGN KEY (derivation) REFERENCES Builds (derivation)
 );
 
diff --git a/src/sql/upgrade-4.sql b/src/sql/upgrade-4.sql
new file mode 100644
index 0000000..e567f03
--- /dev/null
+++ b/src/sql/upgrade-4.sql
@@ -0,0 +1,18 @@
+BEGIN TRANSACTION;
+
+ALTER TABLE Outputs RENAME TO tmp_Outputs;
+
+CREATE TABLE Outputs (
+  derivation TEXT NOT NULL,
+  name TEXT NOT NULL,
+  path TEXT NOT NULL PRIMARY KEY,
+  FOREIGN KEY (derivation) REFERENCES Builds (derivation)
+);
+
+INSERT OR IGNORE INTO Outputs (derivation, name, path)
+SELECT derivation, name, path
+FROM tmp_Outputs;
+
+DROP TABLE tmp_Outputs;
+
+COMMIT;
diff --git a/tests/database.scm b/tests/database.scm
index 21a12f4..d9dfe13 100644
--- a/tests/database.scm
+++ b/tests/database.scm
@@ -57,14 +57,15 @@
 
 (define* (make-dummy-build drv
                            #:optional (eval-id 42)
-                           #:key (outputs '(("foo" . "/foo"))))
+                           #:key (outputs
+                                  `(("foo" . ,(format #f "~a.output" drv)))))
   `((#:derivation . ,drv)
     (#:eval-id . ,eval-id)
     (#:job-name . "job")
     (#:system . "x86_64-linux")
     (#:nix-name . "foo")
     (#:log . "log")
-    (#:outputs . (("foo" . "/foo")))))
+    (#:outputs . ,outputs)))
 
 (define-syntax-rule (with-temporary-database body ...)
   (call-with-temporary-output-file
@@ -114,6 +115,17 @@ INSERT INTO Evaluations (specification, in_progress) VALUES (3, false);")
       ;; there, see <https://bugs.gnu.org/28094>.
       (db-add-build build)))
 
+  (test-equal "db-add-build-with-fixed-output"
+    #f
+    (let ((build1 (make-dummy-build "/fixed1.drv"
+                                    #:outputs '(("out" . "/fixed-output"))))
+          (build2 (make-dummy-build "/fixed2.drv"
+                                    #:outputs '(("out" . "/fixed-output")))))
+      (db-add-build build1)
+
+      ;; Should return #f because the outputs are the same.
+      (db-add-build build2)))
+
   (test-equal "db-update-build-status!"
     (list (build-status scheduled)
           (build-status started)
-- 
2.19.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [bug#32879] [PATCH] database: Add builds only if one of their outputs is new.
  2018-09-29 20:35 [bug#32879] [PATCH] database: Add builds only if one of their outputs is new Clément Lassieur
@ 2018-09-30 19:34 ` Ludovic Courtès
  2018-09-30 20:41   ` Clément Lassieur
  0 siblings, 1 reply; 6+ messages in thread
From: Ludovic Courtès @ 2018-09-30 19:34 UTC (permalink / raw)
  To: Clément Lassieur; +Cc: 32879

Hello Clément,

Clément Lassieur <clement@lassieur.org> skribis:

> * Makefile.am (dist_sql_DATA): Add 'src/sql/upgrade-4.sql'.
> * src/cuirass/database.scm (db-add-output): New procedure.
> (db-add-build): Call DB-ADD-OUTPUT, rollback the transaction and return #f if
> DB-ADD-OUTPUT returned an empty list.
> * src/schema.sql (Outputs): Set 'path' as primary key, instead of 'derivation,
> name'.
> * src/sql/upgrade-4.sql: New file with SQL queries to upgrade the database.
> * tests/database.scm (make-dummy-build): Use the #:OUTPUTS key.  Get default
> OUTPUTS to depend on DRV.
> ("db-add-build-with-fixed-output"): New test.

What’s the rationale?  I suppose having a simpler primary key for
‘Outputs’ might help performance?

Thank you,
Ludo’.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bug#32879] [PATCH] database: Add builds only if one of their outputs is new.
  2018-09-30 19:34 ` Ludovic Courtès
@ 2018-09-30 20:41   ` Clément Lassieur
  2018-10-01  9:09     ` Clément Lassieur
  0 siblings, 1 reply; 6+ messages in thread
From: Clément Lassieur @ 2018-09-30 20:41 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 32879

Ludovic Courtès <ludo@gnu.org> writes:

> Hello Clément,
>
> Clément Lassieur <clement@lassieur.org> skribis:
>
>> * Makefile.am (dist_sql_DATA): Add 'src/sql/upgrade-4.sql'.
>> * src/cuirass/database.scm (db-add-output): New procedure.
>> (db-add-build): Call DB-ADD-OUTPUT, rollback the transaction and return #f if
>> DB-ADD-OUTPUT returned an empty list.
>> * src/schema.sql (Outputs): Set 'path' as primary key, instead of 'derivation,
>> name'.
>> * src/sql/upgrade-4.sql: New file with SQL queries to upgrade the database.
>> * tests/database.scm (make-dummy-build): Use the #:OUTPUTS key.  Get default
>> OUTPUTS to depend on DRV.
>> ("db-add-build-with-fixed-output"): New test.
>
> What’s the rationale?  I suppose having a simpler primary key for
> ‘Outputs’ might help performance?

There is a slight performance and db size gain but the primary reason is
to have a better idea of Cuirass' load when looking at the pending
builds.  There will be less (no?) 'fake' builds.  The idea is that all
builds should be real builds.

Does that make sense?

> Thank you,
> Ludo’.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bug#32879] [PATCH] database: Add builds only if one of their outputs is new.
  2018-09-30 20:41   ` Clément Lassieur
@ 2018-10-01  9:09     ` Clément Lassieur
  2018-10-02  9:08       ` Ludovic Courtès
  0 siblings, 1 reply; 6+ messages in thread
From: Clément Lassieur @ 2018-10-01  9:09 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 32879

Clément Lassieur <clement@lassieur.org> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Hello Clément,
>>
>> Clément Lassieur <clement@lassieur.org> skribis:
>>
>>> * Makefile.am (dist_sql_DATA): Add 'src/sql/upgrade-4.sql'.
>>> * src/cuirass/database.scm (db-add-output): New procedure.
>>> (db-add-build): Call DB-ADD-OUTPUT, rollback the transaction and return #f if
>>> DB-ADD-OUTPUT returned an empty list.
>>> * src/schema.sql (Outputs): Set 'path' as primary key, instead of 'derivation,
>>> name'.
>>> * src/sql/upgrade-4.sql: New file with SQL queries to upgrade the database.
>>> * tests/database.scm (make-dummy-build): Use the #:OUTPUTS key.  Get default
>>> OUTPUTS to depend on DRV.
>>> ("db-add-build-with-fixed-output"): New test.
>>
>> What’s the rationale?  I suppose having a simpler primary key for
>> ‘Outputs’ might help performance?
>
> There is a slight performance and db size gain but the primary reason is
> to have a better idea of Cuirass' load when looking at the pending
> builds.  There will be less (no?) 'fake' builds.  The idea is that all
> builds should be real builds.

Also, it would allow to be notified when someone pushes on master a
commit that triggers too many builds.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [bug#32879] [PATCH] database: Add builds only if one of their outputs is new.
  2018-10-01  9:09     ` Clément Lassieur
@ 2018-10-02  9:08       ` Ludovic Courtès
  2018-10-02 11:26         ` bug#32879: " Clément Lassieur
  0 siblings, 1 reply; 6+ messages in thread
From: Ludovic Courtès @ 2018-10-02  9:08 UTC (permalink / raw)
  To: Clément Lassieur; +Cc: 32879

Hello!

Clément Lassieur <clement@lassieur.org> skribis:

> Clément Lassieur <clement@lassieur.org> writes:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> Hello Clément,
>>>
>>> Clément Lassieur <clement@lassieur.org> skribis:
>>>
>>>> * Makefile.am (dist_sql_DATA): Add 'src/sql/upgrade-4.sql'.
>>>> * src/cuirass/database.scm (db-add-output): New procedure.
>>>> (db-add-build): Call DB-ADD-OUTPUT, rollback the transaction and return #f if
>>>> DB-ADD-OUTPUT returned an empty list.
>>>> * src/schema.sql (Outputs): Set 'path' as primary key, instead of 'derivation,
>>>> name'.
>>>> * src/sql/upgrade-4.sql: New file with SQL queries to upgrade the database.
>>>> * tests/database.scm (make-dummy-build): Use the #:OUTPUTS key.  Get default
>>>> OUTPUTS to depend on DRV.
>>>> ("db-add-build-with-fixed-output"): New test.
>>>
>>> What’s the rationale?  I suppose having a simpler primary key for
>>> ‘Outputs’ might help performance?
>>
>> There is a slight performance and db size gain but the primary reason is
>> to have a better idea of Cuirass' load when looking at the pending
>> builds.  There will be less (no?) 'fake' builds.  The idea is that all
>> builds should be real builds.
>
> Also, it would allow to be notified when someone pushes on master a
> commit that triggers too many builds.

That makes a lot of sense, thank you!

IMO you can go ahead and push it.

Ludo’.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#32879: [PATCH] database: Add builds only if one of their outputs is new.
  2018-10-02  9:08       ` Ludovic Courtès
@ 2018-10-02 11:26         ` Clément Lassieur
  0 siblings, 0 replies; 6+ messages in thread
From: Clément Lassieur @ 2018-10-02 11:26 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 32879-done

Hi Ludo,

Ludovic Courtès <ludo@gnu.org> writes:

> That makes a lot of sense, thank you!
>
> IMO you can go ahead and push it.

Great!  Thank you for the review.

Clément

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-10-02 11:27 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-09-29 20:35 [bug#32879] [PATCH] database: Add builds only if one of their outputs is new Clément Lassieur
2018-09-30 19:34 ` Ludovic Courtès
2018-09-30 20:41   ` Clément Lassieur
2018-10-01  9:09     ` Clément Lassieur
2018-10-02  9:08       ` Ludovic Courtès
2018-10-02 11:26         ` bug#32879: " Clément Lassieur

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).