lubrito@posteo.net writes:

> Hi,
>
> On Wed, 28 Apr 2021 21:00:44 +0100
> Christopher Baines wrote:
>
>> I think time is the main thing, since that alone will help to identify
>> which are the slow parts.
>
>> Indeed, simplicity is a good thing to keep in mind.
>
> I'll keep in mind what you have said, so here are some ideas for
> breaking up the tasks.
>
> Task 1: Add instrumentation to identify the slow parts of processing
> new revisions
>
>   - Implementing a chart over time to identify slow parts;

Great, can you add more detail to this bit? Given the instrumentation is
a really important part, it would be good to have some working ideas for
what this chart might look like, and what goes into making it (like
where the data comes from, and if and where it's stored). There's a
rough sketch of one possible shape for this at the end of this message.

>   - Select candidate parts to be analyzed;
>     - Should this part really be taking all this time?
>     - Identify if the delay is caused by the code itself or by a
>       query.
>   - Prepare a document organizing where each part should be improved.

Given these don't contribute to the instrumentation, I would see them as
part of "Task 2".

> Task 2: Improve the performance of these slow parts
>
>   - Implementing each improvement.
>
> This Task 2 is really difficult for me to divide into smaller tasks;
> I'm really stuck for ideas.

I'd try to set out more of a strategy: what might be the causes of
slowness, how would you investigate them, and what are common approaches
to making those things faster? (The EXPLAIN ANALYZE example at the end
of this message shows one way of investigating a slow query.)

> I took a look at one of the logs as you told me, and I found these
> parts taking a considerable time:

Great :)

>   - Computing the channel derivations: 636s
>   - Acquiring advisory transaction lock:
>     loading-new-guix-revision-inserts: 2534s *
>   - Getting derivation lint warnings: 975s
>   - Fetching inferior lint warnings: 1488s
>   - Acquiring advisory transaction lock:
>     loading-new-guix-revision-inserts: 3226s *
>   - Querying the temp-package-metadata: 1949s
>   - Fetching inferior package metadata: 2476s
>
> Is picking out these parts what the chart should make easier? So I
> would perform the analysis in Task 1, and then do the improvements in
> Task 2?

I would hope that the "instrumentation" would make finding out which
parts are slowest easier, and making it visual information, rather than
text scattered through the long log, will hopefully go a long way
towards that.

> By the way, I noticed that the ones I marked with * are performed
> twice. Why is that?

I'd try reading the code to find out.

In case you haven't worked with database locks much before, they're
being used in this case to allow multiple processes to interact with the
database to load different revisions at the same time. The locking is
very coarse, so locks can be held for a very long time. What you're
seeing in the log, when it says it's acquiring a lock and that takes a
long time, is that the process was waiting for another process to commit
or roll back a transaction in which it was holding that same lock.

The technical details that are useful to know:

 - The locks here apply within transactions
 - The lock will be released when the transaction commits or rolls back
 - Advisory locks are user-defined locks:
   https://www.postgresql.org/docs/13/explicit-locking.html#ADVISORY-LOCKS
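
In case it helps to see that behaviour concretely, here's a minimal
sketch of transaction-level advisory locks in psql. The key value 42 is
arbitrary, just for illustration; in the service the key corresponds to
a name like loading-new-guix-revision-inserts, however that mapping is
done:

  -- Session 1: take a transaction-level advisory lock
  BEGIN;
  SELECT pg_advisory_xact_lock(42);
  -- ... long-running inserts happen here ...
  COMMIT;  -- the lock is released here (or on ROLLBACK)

  -- Session 2, started while session 1 is still in its transaction:
  -- this call blocks until session 1 commits or rolls back, which is
  -- the waiting you can see in the log
  BEGIN;
  SELECT pg_advisory_xact_lock(42);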
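
Going back to the chart question above, and purely as a sketch of one
possible answer to "where does the data come from and where is it
stored": the per-step durations that currently end up in the logs could
also be written to a table, and the chart would then just be a query
over that table, plotted as duration per step over time. The table and
column names here are hypothetical:

  -- Hypothetical table: one row per named step of processing a revision
  CREATE TABLE revision_processing_step_timings (
      id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
      commit_hash text NOT NULL,   -- the revision being processed
      step text NOT NULL,          -- e.g. 'computing-channel-derivations'
      started_at timestamptz NOT NULL,
      duration_seconds integer NOT NULL
  );

  -- The data behind a "duration of each step over time" chart
  SELECT step, started_at, duration_seconds
    FROM revision_processing_step_timings
   ORDER BY started_at;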
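
And on working out whether a slow part is slow because of a query:
EXPLAIN ANALYZE is the standard PostgreSQL tool for that, as it runs the
statement and reports the actual time spent in each part of the plan
(sequential scans over big tables and sorts are common culprits). The
query below is only illustrative, not taken from the service:

  -- Runs the statement and reports per-node timings; a large
  -- sequential scan here would suggest a missing index, for example
  EXPLAIN ANALYZE
  SELECT *
    FROM package_metadata
   WHERE synopsis LIKE '%game%';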