From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?UTF-8?Q?G=C3=A1bor_Boskovits?= Subject: Re: Cuirass news Date: Fri, 9 Feb 2018 07:17:50 +0100 Message-ID: References: <877es6x5xj.fsf@gnu.org> <87lggmjjgo.fsf@gmail.com> <87k1w6jjak.fsf@gmail.com> <87h8raxeym.fsf@gnu.org> <20180126153005.259a75e8@scratchpost.org> <87zi4z1eb0.fsf@gnu.org> <20180127181852.42f0bcbf@scratchpost.org> <87fu6bwqix.fsf@gnu.org> <20180208172905.19e9e789@scratchpost.org> <871shvt94p.fsf@gnu.org> <20180209000526.5b9ea8e7@scratchpost.org> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="94eb2c05f71af4c52c0564c17c11" Return-path: Received: from eggs.gnu.org ([2001:4830:134:3::10]:33816) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ek20T-0001JR-NJ for guix-devel@gnu.org; Fri, 09 Feb 2018 01:17:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ek20R-0005Mx-Q5 for guix-devel@gnu.org; Fri, 09 Feb 2018 01:17:57 -0500 In-Reply-To: <20180209000526.5b9ea8e7@scratchpost.org> List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org Sender: "Guix-devel" To: Danny Milosavljevic Cc: guix-devel --94eb2c05f71af4c52c0564c17c11 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable 2018-02-09 0:05 GMT+01:00 Danny Milosavljevic : > Hi Ludo, > > On Thu, 08 Feb 2018 23:21:58 +0100 > ludo@gnu.org (Ludovic Court=C3=A8s) wrote: > > > > from a security standpoint - except for db-get-builds, which I'm > amending > > > right now. > > > > Oh sorry, I think I did the same thing as you were sending this message= : > > > > https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/commit/?id=3D > 8c7c93922bbe0513ff4c4ff3a6e554e3a72635b6 > > > WDYT? > > I'd prefer not to have so many different SQL statements, we get a > combinatorial explosion if we aren't careful (whether we cache or not, > the relational database management system is going to hate us anyway > when we do that). > > But I guess there are not that many yet. > > If we are fine in not being able to search for literal NULL we can use > NULL as > "anything" marker and have a static WHERE statement (this way is > customary). > > Also, I've asked on the sqlite mailing list - ORDER BY cannot support "?"= , > so > those are unavoidable (also, we can't usefully do the ORDER BY ourselves > by sorting the result - because of the LIMIT clause) > > Anyway as long as we are under 10000 statements it should be fine :P > > > Indeed! Should we change =E2=80=98sqlite-finalize=E2=80=99 to a noop w= hen called on a > > cached statement? (Otherwise users would have to keep track of whether > > or not a statement is cached.) > > Hmm maybe that's a good way. But its a little magic. > > If you are not finalizing the statement, it will be reused anyway the nex= t > time > you use the same SQL text. The user just shouldn't call finalize - which > sounds > simple enough for him not to do. > > I think having sqlite-exec detect literal SQL text is a nice middle way. > > If the SQL text is a literal it means it's right there in the source code > and is probably not going to change - how would it? > > Otherwise err on the side of caution and finalize the statement - it's > a little slower but safer that way. I think that would pretty much only > mean db-get-builds. > > Do you think that's too much magic? Or more than the other way? I > wonder... > > I think that if we do this magic we do it right there in the cuirass > database.scm > module and it's never going to move into guile-sqlite3 :) > > On the other hand, if we special-cached sqlite-finalize, we'd have to > provide > sqlite-finalize* or something that does the freeing anyway... > > > Besides, on the big database on berlin, the initial: > > > > (db-get-builds db '((status pending))) > > > > call takes a lot of time and memory. I guess we=E2=80=99re doing somet= hing > > wrong, but I=E2=80=99m not sure what. The same query in the =E2=80=98s= qlite3=E2=80=99 CLI is > > snappy and does not consume much memory. > > WTF. I'll have a look. > > > One of the things we=E2=80=99re doing wrong is that =E2=80=98Outputs=E2= =80=99 table: each > > =E2=80=98db-format-build=E2=80=99 call triggers a lookup in that table.= We should > > instead probably simply store output lists in the =E2=80=98Derivations= =E2=80=99 table. > > Definitely. That's one of the things we should inline into db-get-builds= . > Relational databases are good at joins, let's not to their work for them. > > > Which also means we should have schema versioning and a way to upgrade= =E2=80=A6 > > Yeah. > > I've used this: http://sqitch.org/ for a few projects, if you see it fits the bill I can help to get this working. I liked it, because it supports different databases, it was a big plus for me. It is also nice to be able to declare dependencies of changes. If you have other preferred method for this, the I would like to hear about that. I use these kind of things regulary in my work, it would be nice to get to know new methods. Thanks. > > > I've also reintroduced sqlite-bind-args in a nicer version, please > pull: > > > https://notabug.org/civodul/guile-sqlite3/pulls/3 . > > > > It is OK with you to write it like this: > > Yes, looks good! Thanks! > > --94eb2c05f71af4c52c0564c17c11 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
2018= -02-09 0:05 GMT+01:00 Danny Milosavljevic <dannym@scratchpost.org= >:
Hi Ludo,<= br>
On Thu, 08 Feb 2018 23:21:58 +0100
ludo@gnu.org (Ludovic Court=C3=A8s) wro= te:

> > from a security standpoint - except for db-get-builds, which I= 9;m amending
> > right now.
>
> Oh sorry, I think I did the same thing as you were sending this messag= e:
>
>=C2=A0 =C2=A0https://git.savannah.gnu.org/cgit/guix/guix-= cuirass.git/commit/?id=3D8c7c93922bbe0513ff4c4ff3a6e554e3a72= 635b6

> WDYT?

I'd prefer not to have so many different SQL statements, we get = a
combinatorial explosion if we aren't careful (whether we cache or not,<= br> the relational database management system is going to hate us anyway
when we do that).

But I guess there are not that many yet.

If we are fine in not being able to search for literal NULL we can use NULL= as
"anything" marker and have a static WHERE statement (this way is = customary).

Also, I've asked on the sqlite mailing list - ORDER BY cannot support &= quot;?", so
those are unavoidable (also, we can't usefully do the ORDER BY ourselve= s
by sorting the result - because of the LIMIT clause)

Anyway as long as we are under 10000 statements it should be fine :P

> Indeed!=C2=A0 Should we change =E2=80=98sqlite-finalize=E2=80=99 to a = noop when called on a
> cached statement?=C2=A0 (Otherwise users would have to keep track of w= hether
> or not a statement is cached.)

Hmm maybe that's a good way.=C2=A0 But its a little magic.

If you are not finalizing the statement, it will be reused anyway the next = time
you use the same SQL text.=C2=A0 The user just shouldn't call finalize = - which sounds
simple enough for him not to do.

I think having sqlite-exec detect literal SQL text is a nice middle way.
If the SQL text is a literal it means it's right there in the source co= de
and is probably not going to change - how would it?

Otherwise err on the side of caution and finalize the statement - it's<= br> a little slower but safer that way.=C2=A0 I think that would pretty much on= ly
mean db-get-builds.

Do you think that's too much magic?=C2=A0 Or more than the other way?= =C2=A0 I wonder...

I think that if we do this magic we do it right there in the cuirass databa= se.scm
module and it's never going to move into guile-sqlite3 :)

On the other hand, if we special-cached sqlite-finalize, we'd have to p= rovide
sqlite-finalize* or something that does the freeing anyway...

> Besides, on the big database on berlin, the initial:
>
>=C2=A0 =C2=A0(db-get-builds db '((status pending)))
>
> call takes a lot of time and memory.=C2=A0 I guess we=E2=80=99re doing= something
> wrong, but I=E2=80=99m not sure what.=C2=A0 The same query in the =E2= =80=98sqlite3=E2=80=99 CLI is
> snappy and does not consume much memory.

WTF.=C2=A0 I'll have a look.

> One of the things we=E2=80=99re doing wrong is that =E2=80=98Outputs= =E2=80=99 table: each
> =E2=80=98db-format-build=E2=80=99 call triggers a lookup in that table= .=C2=A0 We should
> instead probably simply store output lists in the =E2=80=98Derivations= =E2=80=99 table.

Definitely.=C2=A0 That's one of the things we should inline into= db-get-builds.
Relational databases are good at joins, let's not to their work for the= m.

> Which also means we should have schema versioning and a way to upgrade= =E2=80=A6

Yeah.

I've used this:=C2= =A0http://sqitch.org/=C2=A0for a few pro= jects, if you see it fits the bill
I can help to get this working= . I liked it, because it supports different databases,
it was a b= ig plus for me. It is also nice to be able to declare dependencies of chang= es.

If you have other preferred method for this, t= he I would like to hear about that.
I use these kind of things re= gulary in my work, it would be nice to get to know new
methods. T= hanks.
=C2=A0
> > I've also reintroduced sqlite-bind-args in a nicer version, p= lease pull:
> > https://notabug.org/civodul/guile-sq= lite3/pulls/3 .
>
> It is OK with you to write it like this:

Yes, looks good!=C2=A0 Thanks!


--94eb2c05f71af4c52c0564c17c11--