From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christopher Baines
To: Guix Devel
Subject: Persistent heap usage when computing derivations
Date: Sat, 09 Nov 2024 19:53:13 +0000
Message-ID: <87a5e8tc9i.fsf@cbaines.net>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
	micalg=pgp-sha512; protocol="application/pgp-signature"
List-Id: "Development of GNU Guix and the GNU System distribution."

--=-=-=
Content-Type: text/plain

Hey,

I've been putting some more time and money into trying to get the QA
data service (data.qa.guix.gnu.org) to perform better recently, but
unfortunately I haven't been having much success.

I've been trying to parallelise more, and while I think this should
speed things up, I'm having to reduce the actual parallelism due to
lack of memory (the machine I rent for data.qa.guix.gnu.org just has
32G).

One of the memory problems I'm having relates to the Guix inferior
processes that the data service uses when computing derivations. The
data service goes through the list of systems (x86_64-linux,
aarch64-linux, ...) and because the data cached for x86_64-linux
probably doesn't relate to aarch64-linux, there's some code that
attempts to clear the caches [1].

1: https://git.savannah.gnu.org/cgit/guix/data-service.git/tree/guix-data-service/jobs/load-new-guix-revision.scm#n1970

Unfortunately this code has to reach into Guix internals to try and do
this. It does reduce the heap usage significantly, but it doesn't
result in stable memory usage.

Each system processed seems to add about 250MiB of data to the Guile
heap that isn't cleared out. To me that sounds like a lot of memory,
but there are also a lot of systems/targets, so overall this leads to
the inferior process ending up with around 6GiB of data in the heap
after processing all the systems/targets. This peak memory usage really
limits how much the machine can do.

These numbers come from this specific job that ran with a parallelism
of 1 to get clear data [2].

2: https://data.qa.guix.gnu.org/job/60896

I've tried using the heap profiler that Ludo wrote, but nothing jumps
out at me about what this extra 250MiB of stuff in the heap relates
to. I'm also aware that my current cache cleanup doesn't actually
remove references to the hash tables themselves, but I doubt they take
up this much space.

Does anyone have any suggestions as to what might be taking up this
space on the heap, or how to try and find out?

Thanks,

Chris

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQKlBAEBCgCPFiEEPonu50WOcg2XVOCyXiijOwuE9XcFAmcvvalfFIAAAAAALgAo
aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldDNF
ODlFRUU3NDU4RTcyMEQ5NzU0RTBCMjVFMjhBMzNCMEI4NEY1NzcRHG1haWxAY2Jh
aW5lcy5uZXQACgkQXiijOwuE9XcWIhAAlDj27PXKoTqR/NO0Ts70vfh5MVRrAA0w
0grH8Qf5MoVyHRxRGDtgxbGJJhaCouJAP2XZkJ6iN3bHxB49K76P+5Eb63p4fvOe
dtnUWd5u7hFq/Qb3yvZg7GuRLvRprJCJOwkmz2mtStN3V3pQwk0062ipj+fH4TgI
Gi72UiLERblyDNcTUeDKbw+djxN0kAPI1EZPoSWBby3LOfIR35sgEAkLIX7sbCJg
zvhnS0qLKin5xf3dJ+BsWvSsjVNqAX1IRvVKoDC3C+Vg7B5DU2UBdFC4pPBoZtiu
RNd01S95OcLI9sUtPmX/vJmhGaJnlj5abPSM5EQloehAXsrnZXqhivBxv3JTRZop
J4AatdY2iI/QvBjHoZbmBa3bAWTDu8652+o0IBXdL/3VMW/lcxAynVUlNTG3inCd
I9owMpZ10x3c7xgqMkEPGhhgGaa8CQFsq3skRGhTOOQt67Kk9T5McVYmTsOXnxNp
lZf+SYdUeJ/ErcZkKQepNybK2ik/tynn4VlES7RdZS/nmFULl+Um/RvOt/qwlKpa
8XxsiWSsaoiPF7/i3wH8yGle3C8qu5+lCYFi22fCx8Q8WMaxfPVS7Sd79dds3L3m
WgxEccqjsH/aFWGsaDVJTXPuYI6woUk4tCVNEbGSzQ7+H3mSf3RPQBcxNWbPTxRT
MwP6755ANs8=
=ENcL
-----END PGP SIGNATURE-----
--=-=-=--