From mboxrd@z Thu Jan 1 00:00:00 1970 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) Subject: Re: Cuirass news Date: Fri, 09 Feb 2018 10:41:13 +0100 Message-ID: <87wozm4i12.fsf@gnu.org> References: <877es6x5xj.fsf@gnu.org> <87lggmjjgo.fsf@gmail.com> <87k1w6jjak.fsf@gmail.com> <87h8raxeym.fsf@gnu.org> <20180126153005.259a75e8@scratchpost.org> <87zi4z1eb0.fsf@gnu.org> <20180127181852.42f0bcbf@scratchpost.org> <87fu6bwqix.fsf@gnu.org> <20180208172905.19e9e789@scratchpost.org> <871shvt94p.fsf@gnu.org> <20180209000526.5b9ea8e7@scratchpost.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Return-path: Received: from eggs.gnu.org ([2001:4830:134:3::10]:48364) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ek5BJ-0007dO-SP for guix-devel@gnu.org; Fri, 09 Feb 2018 04:41:23 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ek5BF-0003qJ-Qh for guix-devel@gnu.org; Fri, 09 Feb 2018 04:41:21 -0500 Received: from hera.aquilenet.fr ([2a0c:e300::1]:53956) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1ek5BF-0003on-Ev for guix-devel@gnu.org; Fri, 09 Feb 2018 04:41:17 -0500 In-Reply-To: <20180209000526.5b9ea8e7@scratchpost.org> (Danny Milosavljevic's message of "Fri, 9 Feb 2018 00:05:26 +0100") List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org Sender: "Guix-devel" To: Danny Milosavljevic Cc: guix-devel Hello, Danny Milosavljevic skribis: > On Thu, 08 Feb 2018 23:21:58 +0100 > ludo@gnu.org (Ludovic Court=C3=A8s) wrote: > >> > from a security standpoint - except for db-get-builds, which I'm amend= ing >> > right now.=20=20 >>=20 >> Oh sorry, I think I did the same thing as you were sending this message: >>=20 >> https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/commit/?id=3D8= c7c93922bbe0513ff4c4ff3a6e554e3a72635b6 > >> WDYT? > > I'd prefer not to have so many different SQL statements, we get a > combinatorial explosion if we aren't careful (whether we cache or not, > the relational database management system is going to hate us anyway > when we do that). > > But I guess there are not that many yet. > > If we are fine in not being able to search for literal NULL we can use NU= LL as > "anything" marker and have a static WHERE statement (this way is customar= y). > > Also, I've asked on the sqlite mailing list - ORDER BY cannot support "?"= , so > those are unavoidable (also, we can't usefully do the ORDER BY ourselves > by sorting the result - because of the LIMIT clause) > > Anyway as long as we are under 10000 statements it should be fine :P Yes. Also, in practice, everyone=E2=80=99s going to make the same /api/* requests (because there are only two clients, the Emacs and the Web UI, and they typically always do the same requests), which in turn means we=E2=80=99ll always get the same =E2=80=98db-get-builds=E2=80=99 call, pos= sibly with just a different limit, but it=E2=80=99s still the same statement. So I think we should be fine. >> Indeed! Should we change =E2=80=98sqlite-finalize=E2=80=99 to a noop wh= en called on a >> cached statement? (Otherwise users would have to keep track of whether >> or not a statement is cached.) > > Hmm maybe that's a good way. But its a little magic. Yes, but I think we have no other option: now that caching is built into sqlite3.scm, it has to be properly handled by all of that module. For the user, it should be a simple matter of choosing #:cache? #t or #:cache? #f, and then (sqlite3) should just DTRT. Otherwise we=E2=80=99d have to remove caching from (sqlite3) altogether, IM= O. WDYT? >> Besides, on the big database on berlin, the initial: >>=20 >> (db-get-builds db '((status pending))) >>=20 >> call takes a lot of time and memory. I guess we=E2=80=99re doing someth= ing >> wrong, but I=E2=80=99m not sure what. The same query in the =E2=80=98sq= lite3=E2=80=99 CLI is >> snappy and does not consume much memory. > > WTF. I'll have a look. Great. :-) For the record, this is the GC profile I get (take it with a grain of salt: it tells us what part of the code allocates, not what the live objects are): --8<---------------cut here---------------start------------->8--- scheme@(guile-user)> (define lst (gcprof (lambda () (db-get-builds db '((st= atus pending)))))) % cumulative self time seconds seconds procedure 33.82 468.32 468.32 bytevector->pointer 22.38 1361.15 309.97 cuirass/database.scm:347:0:db-format-build 11.19 154.98 154.98 hashq-set! 6.33 87.60 87.60 make-bytevector 4.62 64.01 64.01 utf8->string 3.89 1051.19 53.91 cuirass/database.scm:324:0:db-get-outputs 2.92 40.43 40.43 apply-smob/1 2.68 37.06 37.06 dereference-pointer 2.43 33.69 33.69 anon #x28e3088 1.46 1010.76 20.22 cuirass/database.scm:55:0:%sqlite-exec 1.46 20.22 20.22 srfi/srfi-1.scm:269:0:iota 1.22 47.17 16.85 ice-9/boot-9.scm:789:2:catch 1.22 16.85 16.85 ice-9/boot-9.scm:777:2:throw 1.22 16.85 16.85 string->utf8 0.97 13.48 13.48 ice-9/boot-9.scm:701:2:make-prompt-tag 0.73 1384.74 10.11 cuirass/database.scm:375:0:db-get-builds 0.73 10.11 10.11 hash-set! 0.49 6.74 6.74 cons 0.24 3.37 3.37 pointer->bytevector 0.00 4462655.56 0.00 /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-pr= ofile/share/guile/site/2.2/sqlite3.scm:510:2:lp 0.00 815.34 0.00 /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-pro= file/share/guile/site/2.2/sqlite3.scm:311:0:sqlite-prepare 0.00 805.24 0.00 /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-pro= file/share/guile/site/2.2/sqlite3.scm:286:4 0.00 101.08 0.00 /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-pro= file/share/guile/site/2.2/sqlite3.scm:474:0:sqlite-row 0.00 47.17 0.00 /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-pro= file/share/guile/site/2.2/sqlite3.scm:223:2:sqlite-remove-statement! 0.00 47.17 0.00 /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-pro= file/share/guile/site/2.2/sqlite3.scm:241:4 0.00 37.06 0.00 /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-pro= file/share/guile/site/2.2/sqlite3.scm:444:4 0.00 16.85 0.00 /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-pro= file/share/guile/site/2.2/sqlite3.scm:227:22 0.00 16.85 0.00 hash-for-each --- Sample count: 411 Total time: 1384.735520913 seconds (1331.66193132 seconds in GC) --8<---------------cut here---------------end--------------->8--- This query takes ~5 seconds with the CLI (which is still way too much): --8<---------------cut here---------------start------------->8--- sqlite> select count(*) from builds inner join Derivations ON Builds.deriva= tion =3D Derivations.derivation and Builds.evaluation =3D Derivations.evalu= ation ...> INNER JOIN Evaluations ON Derivations.evaluation =3D Evaluations.id ...> INNER JOIN Specifications ON Evaluations.specification =3D Specific= ations.repo_name; 2156003 --8<---------------cut here---------------end--------------->8--- >> One of the things we=E2=80=99re doing wrong is that =E2=80=98Outputs=E2= =80=99 table: each >> =E2=80=98db-format-build=E2=80=99 call triggers a lookup in that table. = We should >> instead probably simply store output lists in the =E2=80=98Derivations= =E2=80=99 table. > > Definitely. That's one of the things we should inline into db-get-builds. > Relational databases are good at joins, let's not to their work for them. Right. >> > I've also reintroduced sqlite-bind-args in a nicer version, please pul= l: >> > https://notabug.org/civodul/guile-sqlite3/pulls/3 .=20=20 >>=20 >> It is OK with you to write it like this: > > Yes, looks good! Thanks! Pushed! Thanks, Ludo=E2=80=99.