From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp2.migadu.com ([2001:41d0:700:3204::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms8.migadu.com with LMTPS id aDHDG/2mnmWfrwAAkFu2QA (envelope-from ) for ; Wed, 10 Jan 2024 15:17:33 +0100 Received: from aspmx1.migadu.com ([2001:41d0:303:e224::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp2.migadu.com with LMTPS id SG+mFf2mnmXD8QAAe85BDQ (envelope-from ) for ; Wed, 10 Jan 2024 15:17:33 +0100 X-Envelope-To: larch@yhetil.org Authentication-Results: aspmx1.migadu.com; dkim=none; spf=pass (aspmx1.migadu.com: domain of "guix-patches-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="guix-patches-bounces+larch=yhetil.org@gnu.org"; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org; s=key1; t=1704896253; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding:resent-cc: resent-from:resent-sender:resent-message-id:in-reply-to:in-reply-to: references:references:list-id:list-help:list-unsubscribe: list-subscribe:list-post; bh=Btl6c8r2rMs8LMHVH7csJEAZAtkFOwc1rNj2JGCuTtk=; b=drRQSknyzPwTni66o0Gf9Q3eNh0aRGxnHAD3NzAJ+rfYzERxlkQgGJjuWw9ofmp2pm9Wad P1K/hD+4JpqK+ZzyVLnMMZ83RBZ/u6ayYCv/sRNH+fOeKU/huwfcRXHgQJHyJMo2u+9EQg r7260qa2QvohfricOWbOsMQWQKrp1LFv+VE1hoAy21GYqr7q9EEI4HO5qUo7iJeO9WVBSC bR/2tSiW8Th62VwaqtbnCCN/eA4js9OGlQy+oeNkud/CUsjYu48EomBW2Y8LSGb0OETQRO DvDOePAIyE/ceYzk9SRGUiD3BHHC7ny7GpkHqoDwW1ErFnllmEWRGAZ6Y6CBcA== ARC-Seal: i=1; s=key1; d=yhetil.org; t=1704896253; a=rsa-sha256; cv=none; b=lABP/bLKTwS+EQaA++/gQOR6tr6aoj7Vp8dt0kXJXlQUIbKkwb6WrD0irUODPqtwj/9JXZ T/pMwYo1UA/IlPKqRFD6NeI/vtkWFQj3zh2oJ28AAexBMhISRsWx066h7JChwuYf8UnRln +pg9/VU2ti65tKna1uUmAGyIc2x/IB6SRUG/xA0zxR0g1KHvPbpm8w2QkcfAT4JlDI0Z7H 
sIB0y6nHuELtpNCnM9nNAe4XcEuPn4Ghnub30/7tiRJG0RXsrhOCQE7Rj9ZC6mD8Cl6Ls/ yXWZT6SvmWgAwFnSEYH3lG5QJNXeHpDnZaUG8+f1nMVLabid8nIXQrWcIUjZMw== ARC-Authentication-Results: i=1; aspmx1.migadu.com; dkim=none; spf=pass (aspmx1.migadu.com: domain of "guix-patches-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="guix-patches-bounces+larch=yhetil.org@gnu.org"; dmarc=none Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 3876A1095B for ; Wed, 10 Jan 2024 15:17:32 +0100 (CET) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rNZOO-0001F7-Ut; Wed, 10 Jan 2024 09:17:17 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rNZOD-0001En-N9 for guix-patches@gnu.org; Wed, 10 Jan 2024 09:17:07 -0500 Received: from debbugs.gnu.org ([2001:470:142:5::43]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rNZOC-0000q2-M9; Wed, 10 Jan 2024 09:17:05 -0500 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1rNZO9-0000qU-VD; Wed, 10 Jan 2024 09:17:01 -0500 X-Loop: help-debbugs@gnu.org Subject: [bug#68266] [PATCH v2] guix: store: Add report-object-cache-duplication. 
References: <87plyfrb2x.fsf@cbaines.net> In-Reply-To: <87plyfrb2x.fsf@cbaines.net> Resent-From: Christopher Baines Original-Sender: "Debbugs-submit" Resent-CC: guix@cbaines.net, dev@jpoiret.xyz, ludo@gnu.org, othacehe@gnu.org, rekado@elephly.net, zimon.toutoune@gmail.com, me@tobias.gr, guix-patches@gnu.org Resent-Date: Wed, 10 Jan 2024 14:17:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 68266 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: patch To: 68266@debbugs.gnu.org Cc: Christopher Baines , Josselin Poiret , Ludovic =?UTF-8?Q?Court=C3=A8s?= , Mathieu Othacehe , Ricardo Wurmus , Simon Tournier , Tobias Geerinckx-Rice X-Debbugs-Original-Xcc: Christopher Baines , Josselin Poiret , Ludovic =?UTF-8?Q?Court=C3=A8s?= , Mathieu Othacehe , Ricardo Wurmus , Simon Tournier , Tobias Geerinckx-Rice Received: via spool by 68266-submit@debbugs.gnu.org id=B68266.17048961631254 (code B ref 68266); Wed, 10 Jan 2024 14:17:01 +0000 Received: (at 68266) by debbugs.gnu.org; 10 Jan 2024 14:16:03 +0000 Received: from localhost ([127.0.0.1]:39381 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rNZNC-0000JT-KS for submit@debbugs.gnu.org; Wed, 10 Jan 2024 09:16:03 -0500 Received: from mira.cbaines.net ([212.71.252.8]:43096) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rNZNA-0000EG-Bt for 68266@debbugs.gnu.org; Wed, 10 Jan 2024 09:16:00 -0500 Received: from localhost (unknown [217.155.61.229]) by mira.cbaines.net (Postfix) with ESMTPSA id ADD2F27BBE9 for <68266@debbugs.gnu.org>; Wed, 10 Jan 2024 12:57:24 +0000 (GMT) Received: from localhost (localhost [local]) by localhost (OpenSMTPD) with ESMTPA id 32e0a20a for <68266@debbugs.gnu.org>; Wed, 10 Jan 2024 12:57:24 +0000 (UTC) From: Christopher Baines Date: Wed, 10 Jan 2024 12:57:23 +0000 Message-ID: <89c875f974d1ad81ddd03f664ef08e397771d224.1704891443.git.mail@cbaines.net> X-Mailer: git-send-email 2.41.0 MIME-Version: 1.0 
Content-Transfer-Encoding: 8bit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+larch=yhetil.org@gnu.org Sender: guix-patches-bounces+larch=yhetil.org@gnu.org X-Migadu-Flow: FLOW_IN X-Migadu-Country: US X-Migadu-Spam-Score: -3.31 X-Spam-Score: -3.31 X-Migadu-Queue-Id: 3876A1095B X-Migadu-Scanner: mx12.migadu.com X-TUID: e+KaFZ0uJQ7L

This is intended to help with spotting duplication in the object cache,
that is, cases where many keys (for example, package records) map to the
same derivation.  This represents an opportunity for improved
performance: if you can reduce this duplication, the cache can take
better advantage of the entries that are already present.

I'm thinking this can be used by the data service, but maybe it could
also be worked into guix commands.

* guix/store.scm (report-object-cache-duplication): New procedure.

Change-Id: Ia6c816f871d10cae6807543224250110099d764f
---
 guix/store.scm | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/guix/store.scm b/guix/store.scm
index 97c4f32a5b..86ca293cac 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -70,6 +70,7 @@ (define-module (guix store)
             current-store-protocol-version        ;for internal use
             cache-lookup-recorder                 ;for internal use
             mcached
+            report-object-cache-duplication
 
             &store-error store-error?
             &store-connection-error store-connection-error?
@@ -2037,6 +2038,64 @@ (define-syntax mcached
     ((_ mvalue object keys ...)
      (mcached eq? mvalue object keys ...))))
 
+(define* (report-object-cache-duplication store #:key (threshold 10)
+                                          (port (current-error-port)))
+  (define cache-values-to-keys
+    (make-hash-table))
+
+  (define (insert key val)
+    (hash-set!
+     cache-values-to-keys
+     key
+     (or (and=> (hash-ref cache-values-to-keys
+                          key)
+                (lambda (existing-values)
+                  (cons val existing-values)))
+         (list val))))
+
+  (let* ((cache-size
+          (vhash-fold
+           (lambda (key value result)
+             (match value
+               ((item . keys*)
+                (insert item key)))
+
+             (+ 1 result))
+           0
+           (store-connection-cache store %object-cache-id)))
+         (cached-values-by-key-count
+          (sort
+           (hash-map->list
+            (lambda (key value)
+              (cons key (length value)))
+            cache-values-to-keys)
+           (lambda (a b)
+             (< (cdr a) (cdr b))))))
+
+    (filter-map
+     (match-lambda
+       ((value . count)
+        (if (>= count threshold)
+            (begin
+              (when port
+                (simple-format port "value ~A cached ~A times\n" value count)
+                (simple-format port "example keys:\n"))
+
+              (let ((keys (hash-ref cache-values-to-keys value)))
+                (when port
+                  (for-each
+                   (lambda (key)
+                     (simple-format port " - ~A\n" key))
+                   (if (> count 10)
+                       (take keys 10)
+                       keys))
+                  (newline port))
+
+                `((value . ,value)
+                  (keys . ,keys))))
+            #f)))
+     cached-values-by-key-count)))
+
 (define (preserve-documentation original proc)
   "Return PROC with documentation taken from ORIGINAL."
   (set-object-property! proc 'documentation

base-commit: e541f9593f8bfc84b6140c2408b393243289fae6
-- 
2.41.0
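
The idea behind report-object-cache-duplication (invert the key-to-value
mapping, then keep the values reached from many keys) can be tried without
a store connection.  What follows is only a minimal sketch under stated
assumptions: a plain alist stands in for the object cache, the names
example-cache and duplicated-values are hypothetical and not part of the
patch, and the ".drv" strings are made-up placeholders rather than real
derivations.

```scheme
(use-modules (srfi srfi-1)
             (ice-9 match))

;; Hypothetical stand-in for the object cache: an alist of (KEY . VALUE).
;; Three distinct keys lead to the same value, one key has its own value.
(define example-cache
  '((hello-a . "hello.drv")
    (hello-b . "hello.drv")
    (hello-c . "hello.drv")
    (emacs-a . "emacs.drv")))

(define (duplicated-values cache threshold)
  "Return a list of (VALUE . KEYS) pairs, one for each value in the alist
CACHE that is reached from at least THRESHOLD keys, most duplicated first."
  (define values->keys
    (make-hash-table))

  ;; Invert the mapping: each value accumulates the keys that point to it.
  (for-each (match-lambda
              ((key . value)
               (hash-set! values->keys value
                          (cons key (hash-ref values->keys value '())))))
            cache)

  ;; Keep only the values at or above the threshold, sorted by key count.
  (sort (filter (match-lambda
                  ((_ . keys)
                   (>= (length keys) threshold)))
                (hash-map->list cons values->keys))
        (lambda (a b)
          (> (length (cdr a)) (length (cdr b))))))

;; Usage: report every value cached under at least two keys.
(for-each (match-lambda
            ((value . keys)
             (simple-format #t "value ~A cached ~A times\n"
                            value (length keys))))
          (duplicated-values example-cache 2))
```

Unlike the procedure in the patch, this sketch sorts in descending order
and keeps reporting separate from the computation; both are illustrative
choices, not suggestions about how the patch should behave.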