From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christopher Baines
To: Luciana Lima Brito
Cc: guix-devel@gnu.org
Subject: Re: Outreachy: Timeline tasks
Date: Wed, 28 Apr 2021 19:17:51 +0100
Message-ID: <87y2d2e0j4.fsf@cbaines.net>
In-reply-to: <20210428145941.4bd0dd6f@lubrito>
References: <20210428145941.4bd0dd6f@lubrito>
User-agent: mu4e 1.4.15; emacs 27.1
List-Id: "Development of GNU Guix and the GNU System distribution."
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature"

--=-=-=
Content-Type: text/plain

Luciana Lima Brito writes:

> I was thinking about the timeline of tasks.
>
> The main tasks are:
>
> 1. Add instrumentation to identify the slow parts of processing
> new revisions
>
> 2. Improve the performance of these slow parts
>
> I'm writing some ideas I have to divide the tasks in small steps, see
> what you think about it.
>
> About the first task I understand that the whole thing starts with
> identifying how the data for new revisions arrives on Guix Data
> Service: the relevant queries and their processing on the code. Based
> on it I would propose start with mapping these queries and their uses,
> so I could run them locally and get their statistics.
>
> Once I get this information I could identify which are the possible
> problematic ones and work on them. If the process is slow but the query
> is not, maybe the problem would be hidden in the code.

So, there's already some code for timing different parts of the data
loading process. If you look in the job output and search for ", took ",
you should see timings printed out.

Having these timings printed out does help, but with the information
only in the log, it isn't easy to figure out which part is the slowest,
for example.

I also wouldn't consider this a "one off" thing: the data loading code
will continue to change as Guix changes, and its performance will
probably change too.

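As a rough illustration of pulling those timings out of a job log and
ranking them, here is a minimal sketch in Python. The exact shape of the
log lines is an assumption (only the ", took " marker comes from this
message), and the step names in SAMPLE_LOG and the helpers parse_timings
and slowest are hypothetical, not part of the Guix Data Service.

```python
import re

# Assumption: each timing line looks like "<label>, took <number> ...".
# Only the ", took " marker is taken from the thread; the rest is a guess.
TOOK_RE = re.compile(r"^(?P<label>.*), took (?P<seconds>\d+(?:\.\d+)?)")


def parse_timings(log_text):
    """Return (label, seconds) pairs for every line that matches TOOK_RE."""
    timings = []
    for line in log_text.splitlines():
        match = TOOK_RE.search(line)
        if match:
            label = match.group("label").strip()
            timings.append((label, float(match.group("seconds"))))
    return timings


def slowest(timings, n=5):
    """Return the n entries with the largest duration, slowest first."""
    return sorted(timings, key=lambda entry: entry[1], reverse=True)[:n]


# Hypothetical log excerpt, made up for this example.
SAMPLE_LOG = """\
debug: Finished step-a, took 3 seconds
some unrelated output
debug: Finished step-b, took 120 seconds
debug: Finished step-c, took 45 seconds
"""

if __name__ == "__main__":
    for label, seconds in slowest(parse_timings(SAMPLE_LOG)):
        print(f"{seconds:8.1f}s  {label}")
```

Something like this could be run against a saved copy of a job log to
get a quick "top slowest steps" list, though it would need adjusting to
whatever the real lines look like.
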
I've been wondering about visualisations. I remember systemd had a
feature to plot the system's boot as an image, which made seeing which
parts are slow much easier (here's an example [1]).

1: https://lizards.opensuse.org/wp-content/uploads/2012/07/plot001.gif

> About the improvements on the performance of slow parts, it is a little
> bit abstract for me to see now how to break it in smaller tasks. I do
> believe that it would require to reformulate some parts of the queries,
> and as their result may change a bit, tweaks could be required on
> the code too. My point is, how would I propose an improvement approach
> if I don't even know what exactly is to be improved? But I imagine that
> work on this second task is more demanding than the first and will take
> most of the time of the internship.

As I said before, this part is dependent on deciding where the areas for
improvement are. Maybe have a look through one of the job logs on
data.guix.gnu.org and see if you can spot some slow parts?

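On the visualisation idea above, a plain-text timeline is one cheap way
to make the slow parts stand out when skimming. The sketch below is
Python and rests on two assumptions: that the steps run one after
another (so each bar starts where the previous one ended), and that the
durations have already been extracted from the log. The function
render_timeline and the step names are hypothetical.

```python
def render_timeline(timings, width=40):
    """Render (label, seconds) pairs as text bars on a shared time axis.

    Assumes the steps ran sequentially, so each bar is offset by the
    total duration of the steps before it.
    """
    total = sum(seconds for _, seconds in timings)
    if total <= 0:
        return []

    lines = []
    elapsed = 0.0
    for label, seconds in timings:
        offset = int(elapsed / total * width)
        # Always draw at least one character so short steps stay visible.
        length = max(1, round(seconds / total * width))
        lines.append(f"{label:<20} |{' ' * offset}{'#' * length}")
        elapsed += seconds
    return lines


if __name__ == "__main__":
    # Hypothetical step names and durations, made up for this example.
    example = [("step-a", 10), ("step-b", 30)]
    for line in render_timeline(example):
        print(line)
```

The same data could later feed a proper image or an HTML page; the text
version is just the quickest thing to try.
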
--=-=-=--