all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: ludo@gnu.org (Ludovic Courtès)
To: Danny Milosavljevic <dannym@scratchpost.org>
Cc: guix-devel <guix-devel@gnu.org>
Subject: Re: Cuirass news
Date: Fri, 09 Feb 2018 10:41:13 +0100	[thread overview]
Message-ID: <87wozm4i12.fsf@gnu.org> (raw)
In-Reply-To: <20180209000526.5b9ea8e7@scratchpost.org> (Danny Milosavljevic's message of "Fri, 9 Feb 2018 00:05:26 +0100")

Hello,

Danny Milosavljevic <dannym@scratchpost.org> skribis:

> On Thu, 08 Feb 2018 23:21:58 +0100
> ludo@gnu.org (Ludovic Courtès) wrote:
>
>> > from a security standpoint - except for db-get-builds, which I'm amending
>> > right now.  
>> 
>> Oh sorry, I think I did the same thing as you were sending this message:
>> 
>>   https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/commit/?id=8c7c93922bbe0513ff4c4ff3a6e554e3a72635b6
>
>> WDYT?
>
> I'd prefer not to have so many different SQL statements, we get a
> combinatorial explosion if we aren't careful (whether we cache or not,
> the relational database management system is going to hate us anyway
> when we do that).
>
> But I guess there are not that many yet.
>
> If we are fine in not being able to search for literal NULL we can use NULL as
> "anything" marker and have a static WHERE statement (this way is customary).
>
> Also, I've asked on the sqlite mailing list - ORDER BY cannot support "?", so
> those are unavoidable (also, we can't usefully do the ORDER BY ourselves
> by sorting the result - because of the LIMIT clause)
>
> Anyway as long as we are under 10000 statements it should be fine :P

Yes.  Also, in practice, everyone’s going to make the same /api/*
requests (because there are only two clients, the Emacs and the Web UI,
and they typically always do the same requests), which in turn means
we’ll always get the same ‘db-get-builds’ call, possibly with just a
different limit, but it’s still the same statement.

So I think we should be fine.

>> Indeed!  Should we change ‘sqlite-finalize’ to a noop when called on a
>> cached statement?  (Otherwise users would have to keep track of whether
>> or not a statement is cached.)
>
> Hmm maybe that's a good way.  But its a little magic.

Yes, but I think we have no other option: now that caching is built into
sqlite3.scm, it has to be properly handled by all of that module.  For
the user, it should be a simple matter of choosing #:cache? #t
or #:cache? #f, and then (sqlite3) should just DTRT.

Otherwise we’d have to remove caching from (sqlite3) altogether, IMO.

WDYT?

>> Besides, on the big database on berlin, the initial:
>> 
>>   (db-get-builds db '((status pending)))
>> 
>> call takes a lot of time and memory.  I guess we’re doing something
>> wrong, but I’m not sure what.  The same query in the ‘sqlite3’ CLI is
>> snappy and does not consume much memory.
>
> WTF.  I'll have a look.

Great.  :-)

For the record, this is the GC profile I get (take it with a grain of
salt: it tells us what part of the code allocates, not what the live
objects are):

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (define lst (gcprof (lambda () (db-get-builds db '((status pending))))))
%     cumulative   self
time   seconds     seconds  procedure
 33.82    468.32    468.32  bytevector->pointer
 22.38   1361.15    309.97  cuirass/database.scm:347:0:db-format-build
 11.19    154.98    154.98  hashq-set!
  6.33     87.60     87.60  make-bytevector
  4.62     64.01     64.01  utf8->string
  3.89   1051.19     53.91  cuirass/database.scm:324:0:db-get-outputs
  2.92     40.43     40.43  apply-smob/1
  2.68     37.06     37.06  dereference-pointer
  2.43     33.69     33.69  anon #x28e3088
  1.46   1010.76     20.22  cuirass/database.scm:55:0:%sqlite-exec
  1.46     20.22     20.22  srfi/srfi-1.scm:269:0:iota
  1.22     47.17     16.85  ice-9/boot-9.scm:789:2:catch
  1.22     16.85     16.85  ice-9/boot-9.scm:777:2:throw
  1.22     16.85     16.85  string->utf8
  0.97     13.48     13.48  ice-9/boot-9.scm:701:2:make-prompt-tag
  0.73   1384.74     10.11  cuirass/database.scm:375:0:db-get-builds
  0.73     10.11     10.11  hash-set!
  0.49      6.74      6.74  cons
  0.24      3.37      3.37  pointer->bytevector
  0.00 4462655.56      0.00  /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-profile/share/guile/site/2.2/sqlite3.scm:510:2:lp
  0.00    815.34      0.00  /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-profile/share/guile/site/2.2/sqlite3.scm:311:0:sqlite-prepare
  0.00    805.24      0.00  /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-profile/share/guile/site/2.2/sqlite3.scm:286:4
  0.00    101.08      0.00  /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-profile/share/guile/site/2.2/sqlite3.scm:474:0:sqlite-row
  0.00     47.17      0.00  /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-profile/share/guile/site/2.2/sqlite3.scm:223:2:sqlite-remove-statement!
  0.00     47.17      0.00  /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-profile/share/guile/site/2.2/sqlite3.scm:241:4
  0.00     37.06      0.00  /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-profile/share/guile/site/2.2/sqlite3.scm:444:4
  0.00     16.85      0.00  /gnu/store/cxxyk9bdas4n7m6zlhdhnm7ixxkw3b0b-profile/share/guile/site/2.2/sqlite3.scm:227:22
  0.00     16.85      0.00  hash-for-each
---
Sample count: 411
Total time: 1384.735520913 seconds (1331.66193132 seconds in GC)
--8<---------------cut here---------------end--------------->8---

This query takes ~5 seconds with the CLI (which is still way too much):

--8<---------------cut here---------------start------------->8---
sqlite> select count(*) from builds inner join Derivations ON Builds.derivation = Derivations.derivation and Builds.evaluation = Derivations.evaluation
   ...> INNER JOIN Evaluations ON Derivations.evaluation = Evaluations.id
   ...> INNER JOIN Specifications ON Evaluations.specification = Specifications.repo_name;
2156003
--8<---------------cut here---------------end--------------->8---

>> One of the things we’re doing wrong is that ‘Outputs’ table: each
>> ‘db-format-build’ call triggers a lookup in that table.  We should
>> instead probably simply store output lists in the ‘Derivations’ table.
>
> Definitely.  That's one of the things we should inline into db-get-builds.
> Relational databases are good at joins, let's not to their work for them.

Right.

>> > I've also reintroduced sqlite-bind-args in a nicer version, please pull:
>> > https://notabug.org/civodul/guile-sqlite3/pulls/3 .  
>> 
>> It is OK with you to write it like this:
>
> Yes, looks good!  Thanks!

Pushed!

Thanks,
Ludo’.

  parent reply	other threads:[~2018-02-09  9:41 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-24 22:12 Cuirass news Ludovic Courtès
2018-01-25 10:55 ` Mathieu Othacehe
2018-01-25 10:59   ` Mathieu Othacehe
2018-01-25 13:09     ` Ludovic Courtès
2018-01-26 14:30       ` Danny Milosavljevic
2018-01-27 16:01         ` Ludovic Courtès
2018-01-27 17:18           ` Danny Milosavljevic
2018-01-27 19:12             ` Danny Milosavljevic
2018-01-28 21:47             ` Ludovic Courtès
2018-01-28 22:23               ` Danny Milosavljevic
2018-01-29  9:57               ` Andy Wingo
2018-02-08 13:37             ` Ludovic Courtès
2018-02-08 16:29               ` Danny Milosavljevic
2018-02-08 22:21                 ` Ludovic Courtès
2018-02-08 23:05                   ` Danny Milosavljevic
2018-02-09  6:17                     ` Gábor Boskovits
2018-02-09  9:41                     ` Ludovic Courtès [this message]
2018-02-09 11:29                       ` Danny Milosavljevic
2018-02-09 16:53                         ` Ludovic Courtès
2018-02-09 17:06                           ` Danny Milosavljevic
2018-02-10 11:18                             ` Ludovic Courtès
2018-02-13  9:12                               ` Danny Milosavljevic
2018-02-14 13:43                                 ` Ludovic Courtès
2018-02-14 23:17                                   ` Ludovic Courtès
2018-02-19 15:35                                     ` Danny Milosavljevic
2018-02-19 15:35                                       ` [PATCH] database: Simplify 'db-get-builds' Danny Milosavljevic
2018-02-19 17:52                                       ` [PATCH] database: db-get-builds: Inline output selection Danny Milosavljevic
2018-02-19 22:08                                       ` Cuirass news Danny Milosavljevic
2018-03-02 13:21                                         ` Ludovic Courtès
2018-03-02 22:06                                           ` Ludovic Courtès
2018-03-02 23:29                                           ` Danny Milosavljevic
2018-02-14 23:21                                   ` Ludovic Courtès
2018-01-25 21:06 ` Ricardo Wurmus
2018-01-26 11:12   ` Ludovic Courtès
2018-01-25 22:28 ` Danny Milosavljevic
2018-01-26 10:47   ` Ludovic Courtès
2018-01-28 12:33     ` Cuirass frontend Danny Milosavljevic
2018-01-29 17:42       ` Ludovic Courtès
2018-01-26  0:46 ` Cuirass news Danny Milosavljevic
2018-01-27 17:27   ` Danny Milosavljevic
2018-01-28 21:48     ` Ludovic Courtès
2018-01-26 17:55 ` Jan Nieuwenhuizen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wozm4i12.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=dannym@scratchpost.org \
    --cc=guix-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.