From mboxrd@z Thu Jan 1 00:00:00 1970 From: Danny Milosavljevic Subject: Re: [Cuirass] Missing database indexes? Date: Tue, 13 Nov 2018 00:27:15 +0100 Message-ID: <20181113002715.33a00ea1@scratchpost.org> References: <87va54yh0c.fsf@gnu.org> <20181110211128.6dc522da@alma-ubu> <87k1ljr1c7.fsf@gnu.org> <20181112195044.6d64f51c@alma-ubu> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; boundary="Sig_//O97PwsAArqo3puDT=ahYa2"; protocol="application/pgp-signature" Return-path: Received: from eggs.gnu.org ([2001:4830:134:3::10]:51064) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gMLcB-0003zK-I3 for guix-devel@gnu.org; Mon, 12 Nov 2018 18:27:32 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gMLc6-0001eG-6K for guix-devel@gnu.org; Mon, 12 Nov 2018 18:27:30 -0500 In-Reply-To: <20181112195044.6d64f51c@alma-ubu> List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org Sender: "Guix-devel" To: =?ISO-8859-1?Q?Bj=F6rn_H=F6fling?= Cc: guix-devel@gnu.org, clement@lassieur.org --Sig_//O97PwsAArqo3puDT=ahYa2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi Bj=C3=B6rn, On Mon, 12 Nov 2018 19:50:44 +0100 Bj=C3=B6rn H=C3=B6fling wrote: > Hm. This code smells ... It looks too complicated. I was trying to cut down the number of prepared statements in use and preve= nt a combinatorial explosion while keeping the kinds of queries we can do open. Either the value of a parameter is specified, in which case the associated = column is filtered for it; or it isn't, then it's not. > So, even when we have a constant part(23=3D23) in the OR clause, this > leads to a full table scan. I think the optimizer cannot detect the > fact that it is a constant boolean value.=20 Sounds like an easy fix in sqlite. Could you report this upstream? > I double-checked with Postgresql and it is also performing a full table > scan in the "boolean-constant OR :id=3Did" case. I could not find any > references on the net about it. Something easy to try would be to use the row values in sqlite instead. See also https://www.sqlite.org/rowvalue.html See also https://lists.gnu.org/archive/html/guix-devel/2018-07/msg00101.html > When this would be Java/JPA I would suggest to dynamically create the > query. Can we do something in Scheme-DB too? I.e. pseudo-code (1) Combinatorial explosion of the number of prepared statements (2) Slow to parse all those SQL statements But if we don't use all combinations in practise then it's not so bad and we could generate those statements after all. It's still a workaround if we have to do that. Then we'd have to make sure that the user can't specify arbitrary combinations and/or limit the number of prepared statements that exist at the same time. Cuirass is a web-facing service so these are not theoretical problems - someone *will* break it (maybe on purpose) if it's possible. All in all, generating these SQL statements is like generating a program source code and recompiling it every time you want to change the filter used. It might be the only good solution in this case (maybe) - but in general it's an antipattern. > We should add a column: >=20 > id INTEGER PRIMARY KEY AUTOINCREMENT >=20 > Problem is that this concept of AUTOINCREMENT does only work for > Primary Keys in Sqlite. So we need to degrade "derivation" to a > secondary key, i.e. make it non-null and unique: >=20 > derivation TEXT NOT NULL UNIQUE, >=20 > Is there anything speaking against that? Sounds good. Note that when you use autoincrement, you eventually have to handle the case when the value overflows. The window of used IDs slowly creeps up over the months. > Lastly, the query has a limit and an order-by. The question is: Will > the result be first ordered and then the limit taken? The answer (I know > only for Postgresql and MySql, but I think it is the same for Sqlite, I don't think the documentation at https://www.sqlite.org/lang_select.html specifies, but it's the only thing that makes sense. Otherwise LIMIT would be useless as a pager. Still would be good to have official confirmation by the sqlite authors. > I haven't found any reference): The order is always executed first, but > it has to be stable. In this case it is, because we order by > Builds.rowid, which is a key. >Did this happen intentionally Of course. Otherwise LIMIT would be useless as a pager. --Sig_//O97PwsAArqo3puDT=ahYa2 Content-Type: application/pgp-signature Content-Description: OpenPGP digital signature -----BEGIN PGP SIGNATURE----- iQEzBAEBCAAdFiEEds7GsXJ0tGXALbPZ5xo1VCwwuqUFAlvqDFMACgkQ5xo1VCww uqUF2Af+ON+MqnBrwCV5DCBIZXAyjE6Yf7+QrXCj1drSyY7ivgSFxTGCEqyxbl5G 5ifskB2vQP0rXLRszZ/ltgFlfQQhJT820a5na6Jd5uLH/3hxLlSUVbPHccM8N9zW iXkDGK+NDmztyrZMfyc4FpQMUyfbMs4+UNf/wfIsm10mQHRa7mSZxY/8T3bysot9 6iw2qmDr70+dJFY0SbB2jnMh1Js1BGn/FmTZNC8pXuuInuxnhTLNHAp4BxYoAqOX 3XrYsI2vz3+nLkqwJUS6cM6tpAzDWJeGBA2cm+R67tgR20nJyfiIzjjm5k36BjnQ zvCDXYF1cE6/DsehTUoIk8yUz+b60A== =IZ1h -----END PGP SIGNATURE----- --Sig_//O97PwsAArqo3puDT=ahYa2--