Subject: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Tue, 6 Nov 2018 21:10:49 +0100
From: Danny Milosavljevic
To: Clément Lassieur
Cc: 33210@debbugs.gnu.org
Message-ID: <20181106211049.1469331b@scratchpost.org>
In-Reply-To: <875zxa5mfn.fsf@lassieur.org>

Hi Clément,

> rather basic way.  But if I understand correctly, the overall spent time
> is more or less the same: only the order of the requests differs.

Yeah, right now users can query something using the web interface while
a build is updating (or running), at the cost of the returned data being
potentially very strange.

The "one-line fix" would make it worse in that users cannot query while
a build is running, making them wait until the build is done (approx.
30 min) before the query succeeds.  The upside is that the returned data
is consistent at all times.  This is how DBMSes did it in the 90s, too.

What I'd like to eventually have is the proper fix where users can query
while a build is running, *and* the build doesn't have to wait either.
This works just fine using two transactions with WAL mode of sqlite,
which means it uses MVCC in order to keep both "world views", one for
the querier and one for the builder, while they are both using "the
world".  (This is easily extended to an arbitrary number of queriers and
builders at once by just having more transactions.)

> is an essential difference however: if we take care of the scheduling,
> we won't have SQLITE_BUSY blocking the Fibers scheduler all the time.
> And blocking the Fibers scheduler has an impact on all other possibly
> unrelated Fibers clients.

Right.

I just wanted to make sure we understand the possible implications here.

In the end I'm not sure we need multithreading even for my scenario -
maybe (probably) just having an extra sqlite_open would be enough,
threads or not.

On the other hand, there are shared caches etc., and this change here
could cause some very tricky problems then.

I have to say I liked the external evaluator much more, since then all
this complexity would be contained in the external program and it would
just magically work without special-casing any of this stuff.

> When guile-sqlite3 handles SQLITE_BUSY
> correctly, I'll be happy to switch back to a multi-threading SQLite.
> While waiting for it, I believe our users want a fast Cuirass, and I'd
> like the time spent in the Fibers scheduler to be balanced by removing
> the SQLite now useless mutexes.

That makes sense.

It's difficult for guile-sqlite3 to handle SQLITE_BUSY correctly since
sqlite also uses SQLITE_BUSY to indicate errors that you are supposed to
fail on.  In the absence of a busy handler, it's not possible to
distinguish whether the SQLITE_BUSY was of the "please retry" kind or of
the "don't you retry" kind.

It would mean that guile-sqlite3 would have to have its own flag that
indicates whether the busy handler was called, and check this one.
Resetting this flag would also have to be potentially thread-safe (for
other users of guile-sqlite3).

That's always assuming that sqlite3 undoes whatever it was trying to do
before returning SQLITE_BUSY, so it actually makes sense to retry the
call later.

So something like this (pseudocode):

  guile_sqlite_handle_busy(...) {
    guile_struct->busy_handler_called = true;
    return 0; // fail: tell sqlite not to retry by itself
  }

  guile_sqlite_open {
    int rc = native_sqlite_open(...);
    native_sqlite_set_busy_handler(..., guile_sqlite_handle_busy);
    // FIXME: check for errors here and fail on error
    guile_struct->busy_handler_called = false;
  }

  guile_sqlite_method {
    int rc, busy_handler_called;
    do {
      rc = native_sqlite_method(...);
      // read the flag and clear it in one (atomic) step
      busy_handler_called = test_and_reset(guile_struct->busy_handler_called);
      if (rc == SQLITE_BUSY && busy_handler_called)
        yield(); // let other fibers run before retrying
    } while (rc == SQLITE_BUSY && busy_handler_called);
    return rc;
  }

Hmmmmmmmm.  I think that can be done.
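
To convince myself the retry loop above terminates in both cases, here
is a small self-contained C sketch.  It does not talk to SQLite at all:
native_sqlite_method is a stub, and struct guile_db, the retryable
argument, contended_calls_left and yield_to_scheduler are all names made
up for this illustration.  Only the shape of the busy handler (a
callback taking a void pointer and a count and returning 0 to give up)
is meant to mirror the real SQLite callback; everything else is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SQLITE_OK   0
#define SQLITE_BUSY 5

/* Hypothetical per-connection state kept by the binding. */
struct guile_db {
    bool busy_handler_called;  /* set by the busy handler, cleared by the caller */
    int  contended_calls_left; /* stub only: calls that still find the lock held */
    int  yields;               /* stub only: how often we yielded to the scheduler */
};

/* Busy handler: remember that SQLite asked "retry?" and answer "no" (0). */
int guile_sqlite_handle_busy(void *data, int count)
{
    struct guile_db *db = data;
    (void)count;
    db->busy_handler_called = true;
    return 0;
}

/* Stub standing in for a real SQLite call.  A retryable SQLITE_BUSY goes
   through the busy handler first; a non-retryable one does not. */
int native_sqlite_method(struct guile_db *db, bool retryable)
{
    if (!retryable)
        return SQLITE_BUSY;
    if (db->contended_calls_left > 0) {
        db->contended_calls_left--;
        guile_sqlite_handle_busy(db, 0);
        return SQLITE_BUSY;
    }
    return SQLITE_OK;
}

/* Read the flag and clear it.  NOT atomic here; a real binding shared
   between threads would need an atomic exchange instead. */
bool test_and_reset(bool *flag)
{
    bool old = *flag;
    *flag = false;
    return old;
}

/* Stand-in for yielding to the Fibers scheduler: just count the calls. */
void yield_to_scheduler(struct guile_db *db)
{
    db->yields++;
}

/* Retry only when SQLITE_BUSY came together with a busy-handler call. */
int guile_sqlite_method(struct guile_db *db, bool retryable)
{
    int rc;
    bool called;

    assert(db != NULL);
    do {
        rc = native_sqlite_method(db, retryable);
        called = test_and_reset(&db->busy_handler_called);
        if (rc == SQLITE_BUSY && called)
            yield_to_scheduler(db);
    } while (rc == SQLITE_BUSY && called);
    return rc;
}
```

With contended_calls_left set to 3, a retryable call yields three times
and then returns SQLITE_OK, while a non-retryable call returns
SQLITE_BUSY immediately without yielding.  Whether real SQLite always
invokes the busy handler before a retryable SQLITE_BUSY is exactly the
part that still needs checking in the sources.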

Notes for myself:

pager.c busyHandler
btreeInvokeBusyHandler
sqlite3BtreeBeginTrans
sqlite3PagerSetBusyhandler
SQLITE_FCNTL_BUSYHANDLER