From: ludo@gnu.org (Ludovic Courtès)
To: "Clément Lassieur" <clement@lassieur.org>
Cc: 32190@debbugs.gnu.org
Subject: bug#32190: Cuirass doesn't check if two subsequent jobs yield the same derivation
Date: Tue, 24 Jul 2018 12:05:35 +0200
Message-ID: <877ell54pc.fsf@gnu.org>
In-Reply-To: <87efg1ijdp.fsf@lassieur.org> ("Clément Lassieur"'s message of "Wed, 18 Jul 2018 00:32:02 +0200")

Hi Clément,

Clément Lassieur <clement@lassieur.org> skribis:

> Consider the following table:
>
> CREATE TABLE Derivations (
>   derivation TEXT NOT NULL,
>   evaluation INTEGER NOT NULL,
>   job_name TEXT NOT NULL,
>   system TEXT NOT NULL,
>   nix_name TEXT NOT NULL,
>   PRIMARY KEY (derivation, evaluation),
>   FOREIGN KEY (evaluation) REFERENCES Evaluations (id)
> );
>
>
> And the following code:
>
> (define (db-add-derivation db job)
>   "Store a derivation result in database DB and return its ID."
>   (catch 'sqlite-error
>     (lambda ()
>       (sqlite-exec db "\
> INSERT INTO Derivations (derivation, job_name, system, nix_name, evaluation)\
> VALUES ("
>                    (assq-ref job #:derivation) ", "
>                    (assq-ref job #:job-name) ", "
>                    (assq-ref job #:system) ", "
>                    (assq-ref job #:nix-name) ", "
>                    (assq-ref job #:eval-id) ");")
>       (last-insert-rowid db))
>     (lambda (key who code message . rest)
>       ;; If we get a unique-constraint-failed error, that means we have
>       ;; already inserted the same (derivation,eval-id) tuple. That happens
>       ;; when several jobs produce the same derivation, and we can ignore it.
>       (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
>           (sqlite-exec db "SELECT * FROM Derivations WHERE derivation="
>                        (assq-ref job #:derivation) ";")
>           (apply throw key who code rest)))))
>
> I think the above constraint can't happen because by definition a new
> job (for the same job_name) is produced at each evaluation. So eval-id
> will be incremented every time.

I added it at the time because it did happen. In a given eval, there
can be two jobs producing the same derivation (for instance a job called
“gcc” produces xyz-gcc-5.5.0.drv, and a job called “gcc-5.5.0” produces
the very same xyz-gcc-5.5.0.drv.)

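A minimal sketch of that collision, written in Python against an in-memory SQLite database rather than in Cuirass's own Scheme. The table follows the schema quoted above, minus the foreign key so the example stays self-contained. The `add_derivation` helper, the "x86_64-linux" system and the nix_name value are assumptions made for illustration; the job names and the .drv name are the ones from the example above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE Derivations (
      derivation TEXT NOT NULL,
      evaluation INTEGER NOT NULL,
      job_name   TEXT NOT NULL,
      system     TEXT NOT NULL,
      nix_name   TEXT NOT NULL,
      PRIMARY KEY (derivation, evaluation)
    )""")

def add_derivation(job):
    """Insert JOB; return True if a row was added, False on a duplicate key."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO Derivations"
                " (derivation, evaluation, job_name, system, nix_name)"
                " VALUES (?, ?, ?, ?, ?)",
                (job["derivation"], job["evaluation"], job["job_name"],
                 job["system"], job["nix_name"]))
        return True
    except sqlite3.IntegrityError:
        # Same (derivation, evaluation) pair already present.
        return False

# Two differently named jobs of evaluation 1 yield the same derivation.
common = {"derivation": "xyz-gcc-5.5.0.drv", "evaluation": 1,
          "system": "x86_64-linux", "nix_name": "gcc-5.5.0"}
first = add_derivation({**common, "job_name": "gcc"})
second = add_derivation({**common, "job_name": "gcc-5.5.0"})
rows = db.execute("SELECT job_name FROM Derivations").fetchall()
```

The second insert is rejected by the primary key even though both jobs belong to the same evaluation, which is the case the handler in db-add-derivation was written for.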
> Also, the docs (and a comment in schema.sql) says:
>
> Builds are not in a one to one relationship with derivations in
> order to keep track of non deterministic compilations.
>
> But I think it doesn't make sense, because Guix won't try to build twice
> the same thing unless '--check' is used (which obviously isn't the
> case).

The rationale (that was back in Mathieu’s GSoC) was that sometimes, you
can have several build logs for one derivation. In Hydra this happens
if a build fails for some non-deterministic reason and then you click on
“Restart” in the hope that it’ll succeed this time. ;-) In this
situation Hydra keeps both build logs IIRC.

Anyway, I lean towards keeping only one build log, at least for now,
which is what guix-daemon does.

> So not only we have a huge Derivations table full of identical items,
> but we also ask Guix to build them and we store the results in the
> Builds table...
>
> Maybe the solution is to replace the (derivation, evaluation) primary
> key with (derivation), and only build the newly added derivations.
> WDYT?

I agree: we don’t need all these identical items; it makes no sense.
You can go ahead and clean that up! ;-)
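
For what it's worth, the proposal could look roughly like the sketch below, again in Python with an in-memory SQLite database. It is not what Cuirass does: the reduced column set, the `register` helper, the "hello" derivation names, and the choice to keep the first evaluation that produced each derivation are all assumptions made for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE Derivations (
      derivation TEXT NOT NULL PRIMARY KEY,
      evaluation INTEGER NOT NULL,  -- assumption: first evaluation that produced it
      job_name   TEXT NOT NULL
    )""")

def register(evaluation, jobs):
    """Record JOBS, a list of (job_name, derivation) pairs, and return
    the derivations that were not in the table before."""
    new = []
    for job_name, drv in jobs:
        cur = db.execute(
            "INSERT OR IGNORE INTO Derivations"
            " (derivation, evaluation, job_name) VALUES (?, ?, ?)",
            (drv, evaluation, job_name))
        if cur.rowcount == 1:  # a row was really inserted
            new.append(drv)
    db.commit()
    return new

# Evaluation 1: two jobs share a derivation, so it is queued once.
to_build_1 = register(1, [("gcc", "xyz-gcc-5.5.0.drv"),
                          ("gcc-5.5.0", "xyz-gcc-5.5.0.drv"),
                          ("hello", "abc-hello-2.10.drv")])
# Evaluation 2: gcc is unchanged, hello has a new derivation.
to_build_2 = register(2, [("gcc", "xyz-gcc-5.5.0.drv"),
                          ("hello", "def-hello-2.10.drv")])
```

With derivation alone as the key, an unchanged job adds no row in later evaluations, so only the derivations returned by `register` would need to be handed to the build step.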
Thank you,
Ludo’.