From mboxrd@z Thu Jan 1 00:00:00 1970 From: ludo@gnu.org (Ludovic =?UTF-8?Q?Court=C3=A8s?=) Subject: bug#32234: Cuirass: The SQLite built in busy handler might block the Fibers scheduler Date: Mon, 23 Jul 2018 11:18:43 +0200 Message-ID: <87bmaytimk.fsf@gnu.org> References: <87k1ponc62.fsf@lassieur.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Return-path: Received: from eggs.gnu.org ([2001:4830:134:3::10]:55009) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fhWzg-0002hW-DB for bug-guix@gnu.org; Mon, 23 Jul 2018 05:19:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fhWze-0007Ut-QL for bug-guix@gnu.org; Mon, 23 Jul 2018 05:19:04 -0400 Received: from debbugs.gnu.org ([208.118.235.43]:48057) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fhWze-0007Uj-L3 for bug-guix@gnu.org; Mon, 23 Jul 2018 05:19:02 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1fhWze-00027x-Cl for bug-guix@gnu.org; Mon, 23 Jul 2018 05:19:02 -0400 Sender: "Debbugs-submit" Resent-Message-ID: In-Reply-To: <87k1ponc62.fsf@lassieur.org> ("=?UTF-8?Q?Cl=C3=A9ment?= Lassieur"'s message of "Sat, 21 Jul 2018 11:57:57 +0200") List-Id: Bug reports for GNU Guix List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guix-bounces+gcggb-bug-guix=m.gmane.org@gnu.org Sender: "bug-Guix" To: =?UTF-8?Q?Cl=C3=A9ment?= Lassieur Cc: 32234@debbugs.gnu.org Hi, Cl=C3=A9ment Lassieur skribis: > I'm trying to understand why Berlin web API times out[1]. Just to be clear, it does block on things involving non-trivial database queries: --8<---------------cut here---------------start------------->8--- ludo@berlin ~$ time wget -q -O - http://localhost:8081/specifications [{"name" : "core-updates-core-updates","load-path-inputs" : ["core-updates"= ],"package-path-inputs" : [],"proc-input" : "core-updates","proc-file" : "b= uild-aux/cuirass/gnu-system.scm","proc" : "cuirass-jobs","proc-args" : {"su= bset" : "core","systems" : ["x86_64-linux", "i686-linux"]},"inputs" : [{"na= me" : "core-updates","url" : "https://git.savannah.gnu.org/git/guix.git","l= oad-path" : ".","branch" : "core-updates","tag" : false,"commit" : false,"n= o-compile?" : true}]}, {"name" : "staging-staging","load-path-inputs" : ["s= taging"],"package-path-inputs" : [],"proc-input" : "staging","proc-file" : = "build-aux/cuirass/gnu-system.scm","proc" : "cuirass-jobs","proc-args" : {"= systems" : ["x86_64-linux", "i686-linux"]},"inputs" : [{"name" : "staging",= "url" : "https://git.savannah.gnu.org/git/guix.git","load-path" : ".","bran= ch" : "staging","tag" : false,"commit" : false,"no-compile?" : true}]}, {"n= ame" : "guix-modular-master","load-path-inputs" : ["guix-modular"],"package= -path-inputs" : [],"proc-input" : "guix-modular","proc-file" : "build-aux/c= uirass/guix-modular.scm","proc" : "cuirass-jobs","proc-args" : {"systems" := ["x86_64-linux", "i686-linux"]},"inputs" : [{"name" : "guix-modular","url"= : "https://git.savannah.gnu.org/git/guix.git","load-path" : ".","branch" := "master","tag" : false,"commit" : false,"no-compile?" : true}]}, {"name" := "guix-master","load-path-inputs" : ["guix"],"package-path-inputs" : [],"pr= oc-input" : "guix","proc-file" : "build-aux/cuirass/gnu-system.scm","proc" = : "cuirass-jobs","proc-args" : {"subset" : "all","systems" : ["x86_64-linux= ", "i686-linux"]},"inputs" : [{"name" : "guix","url" : "https://git.savanna= h.gnu.org/git/guix.git","load-path" : ".","branch" : "master","tag" : false= ,"commit" : false,"no-compile?" : true}]}] real 0m0.007s user 0m0.004s sys 0m0.000s ludo@berlin ~$ time wget -q -O - http://localhost:8081/api/latestbuilds?nr= =3D2 [{"id" : 4484197,"jobset" : "core-updates-core-updates","job" : "ghc-alex-3= .2.3.i686-linux","timestamp" : 1532336715,"starttime" : 1532336542,"stoptim= e" : 1532336715,"derivation" : "/gnu/store/b072sfmw44ww9dsm1ahpqafjk05zsl04= -ghc-alex-3.2.3.drv","buildoutputs" : {"out" : {"path" : "/gnu/store/qk79zl= 1gcg6lh4k2nrp72x8iywy317d2-ghc-alex-3.2.3"}},"system" : "i686-linux","nixna= me" : "ghc-alex-3.2.3","buildstatus" : 0,"busy" : 0,"priority" : 0,"finishe= d" : 1,"buildproducts" : null,"releasename" : null,"buildinputs_builds" : n= ull}, {"id" : 4799156,"jobset" : "core-updates-core-updates","job" : "ghc-a= lex-3.2.3.i686-linux","timestamp" : 1532336715,"starttime" : 1532336542,"st= optime" : 1532336715,"derivation" : "/gnu/store/b072sfmw44ww9dsm1ahpqafjk05= zsl04-ghc-alex-3.2.3.drv","buildoutputs" : {"out" : {"path" : "/gnu/store/q= k79zl1gcg6lh4k2nrp72x8iywy317d2-ghc-alex-3.2.3"}},"system" : "i686-linux","= nixname" : "ghc-alex-3.2.3","buildstatus" : 0,"busy" : 0,"priority" : 0,"fi= nished" : 1,"buildproducts" : null,"releasename" : null,"buildinputs_builds= " : null}] real 0m28.033s user 0m0.004s sys 0m0.000s --8<---------------cut here---------------end--------------->8--- I=E2=80=99m not sure it=E2=80=99s due to locking though; we may lack an ind= ex or two, =E2=80=9Cexplain query plan=E2=80=9D should help. > I think the SQlite built in busy handler may block the Fibers scheduler. > We use "PRAGMA busy_timeout =3D 30000;", which is an alternative to > calling sqlite3_busy_timeout(), whose description[2] is: > > This routine sets a busy handler that sleeps for a specified amount > of time when a table is locked. The handler will sleep multiple > times until at least "ms" milliseconds of sleeping have > accumulated. After at least "ms" milliseconds of sleeping, the > handler returns 0 which causes sqlite3_step() to return SQLITE_BUSY. > > To me this sounds like non-cooperative and non-resumable code. Indeed! > A solution would be to set a custom handler through > sqlite3_busy_handler[3] that would be Fibers compatible, i.e. it would > let the scheduler schedule other fibers instead of just sleeping, using > Fibers 'sleep' procedure[4]. AIUI the handler is called from C, and thus it=E2=80=99s a non-resumable continuation, so this wouldn=E2=80=99t work. Perhaps instead we need to set the timeout to a small value and handle SQLITE_BUSY at the call site in our code. We could define a macro that automatically retries upon SQLITE_BUSY. Thoughts? Ludo=E2=80=99.