all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Danny Milosavljevic <dannym@scratchpost.org>
To: "Björn Höfling" <bjoern.hoefling@bjoernhoefling.de>
Cc: guix-devel@gnu.org, clement@lassieur.org
Subject: Re: [Cuirass] Missing database indexes?
Date: Tue, 13 Nov 2018 00:27:15 +0100	[thread overview]
Message-ID: <20181113002715.33a00ea1@scratchpost.org> (raw)
In-Reply-To: <20181112195044.6d64f51c@alma-ubu>

[-- Attachment #1: Type: text/plain, Size: 3358 bytes --]

Hi Björn,

On Mon, 12 Nov 2018 19:50:44 +0100
Björn Höfling <bjoern.hoefling@bjoernhoefling.de> wrote:

> Hm. This code smells ... It looks too complicated.

I was trying to cut down the number of prepared statements in use and prevent a
combinatorial explosion while keeping the kinds of queries we can do open.

Either the value of a parameter is specified, in which case the associated column
is filtered for it; or it isn't, then it's not.

> So, even when we have a constant part(23=23) in the OR clause, this
> leads to a full table scan. I think the optimizer cannot detect the
> fact that it is a constant boolean value. 

Sounds like an easy fix in sqlite.  Could you report this upstream?

> I double-checked with Postgresql and it is also performing a full table
> scan in the "boolean-constant OR :id=id" case. I could not find any
> references on the net about it.

Something easy to try would be to use the row values in sqlite instead.

See also https://www.sqlite.org/rowvalue.html
See also https://lists.gnu.org/archive/html/guix-devel/2018-07/msg00101.html

> When this would be Java/JPA I would suggest to dynamically create the
> query. Can we do something in Scheme-DB too? I.e. pseudo-code

(1) Combinatorial explosion of the number of prepared statements
(2) Slow to parse all those SQL statements

But if we don't use all combinations in practise then it's not so bad
and we could generate those statements after all.  It's still a
workaround if we have to do that.

Then we'd have to make sure that the user can't specify arbitrary
combinations and/or limit the number of prepared statements that
exist at the same time.  Cuirass is a web-facing service so these
are not theoretical problems - someone *will* break it (maybe on
purpose) if it's possible.

All in all, generating these SQL statements is like generating a
program source code and recompiling it every time you want to change
the filter used.  It might be the only good solution in this case
(maybe) - but in general it's an antipattern.

> We should add a column:
> 
> id INTEGER PRIMARY KEY AUTOINCREMENT
> 
> Problem is that this concept of AUTOINCREMENT does only work for
> Primary Keys in Sqlite. So we need to degrade "derivation" to a
> secondary key, i.e. make it non-null and unique:
> 
> derivation    TEXT NOT NULL UNIQUE,
> 
> Is there anything speaking against that?

Sounds good.  Note that when you use autoincrement, you eventually
have to handle the case when the value overflows.  The window of
used IDs slowly creeps up over the months.

> Lastly, the query has a limit and an order-by. The question is: Will
> the result be first ordered and then the limit taken? The answer (I know
> only for Postgresql and MySql, but I think it is the same for Sqlite,

I don't think the documentation at https://www.sqlite.org/lang_select.html
specifies, but it's the only thing that makes sense.  Otherwise LIMIT would
be useless as a pager.  Still would be good to have official confirmation
by the sqlite authors.

> I haven't found any reference): The order is always executed first, but
> it has to be stable. In this case it is, because we order by
> Builds.rowid, which is a key.

>Did this happen intentionally

Of course.  Otherwise LIMIT would be useless as a pager.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  parent reply	other threads:[~2018-11-12 23:27 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-10 17:33 [Cuirass] Missing database indexes? Ludovic Courtès
2018-11-10 20:11 ` Björn Höfling
2018-11-11 17:06   ` Ludovic Courtès
2018-11-12 18:50     ` Björn Höfling
2018-11-12 19:42       ` Amirouche Boubekki
2018-11-12 23:27       ` Danny Milosavljevic [this message]
2018-11-14 11:14         ` Ludovic Courtès
2018-11-16 22:31         ` Björn Höfling
2018-11-13  8:10       ` Clément Lassieur
2018-11-16 22:42         ` Björn Höfling
2018-11-12 23:31     ` Danny Milosavljevic
2018-11-13  0:04       ` Danny Milosavljevic
2018-11-14 11:11       ` Ludovic Courtès
2018-11-19 10:44         ` Danny Milosavljevic
2018-12-19 22:45           ` Amirouche Boubekki
2018-11-19  9:47 ` swedebugia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181113002715.33a00ea1@scratchpost.org \
    --to=dannym@scratchpost.org \
    --cc=bjoern.hoefling@bjoernhoefling.de \
    --cc=clement@lassieur.org \
    --cc=guix-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.